no filtering without ham
tanderson at orderamidchaos.com
Wed Jan 21 14:14:50 UTC 2009
On Sun, 2009-01-18 at 04:20 +0100, Matthias Andree wrote:
> a - let kmail use bogofilter in three-state mode (ham/spam/unsure) -
> that's the default anyways
> b - then have kmail file unsure messages into an unsure folder
This is what I usually do when I set up a new user with bogofilter. I
think it makes the most sense to them: everything comes in as unsure
until they train each message one way or the other.
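
A rough sketch of the filing step, in case it helps. It assumes the exit
statuses from the bogofilter man page (0 = spam, 1 = ham, 2 = unsure,
3 = error). The folder names and the helpers `folder_for_exit_code` and
`classify` are made up for illustration; this is not how kmail actually
wires it up.

```python
import subprocess

# Exit statuses as documented in the bogofilter man page:
# 0 = spam, 1 = ham (non-spam), 2 = unsure, 3 = I/O or other error.
# The folder names below are hypothetical placeholders.
FOLDERS = {0: "spam", 1: "inbox", 2: "unsure"}


def folder_for_exit_code(code: int) -> str:
    """Map a bogofilter exit status to a destination folder name."""
    if code not in FOLDERS:
        raise RuntimeError(f"bogofilter failed with exit status {code}")
    return FOLDERS[code]


def classify(message: bytes) -> str:
    """Pipe one raw message through bogofilter and pick a folder.

    Assumes a `bogofilter` binary on PATH, running in its default
    three-state mode.
    """
    result = subprocess.run(
        ["bogofilter"], input=message, stdout=subprocess.DEVNULL
    )
    return folder_for_exit_code(result.returncode)
```

With an empty wordlist one would expect every message to land in the
"unsure" folder until the user starts training.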
> > - Kmail could pipe some bogus mails through bogofilter as ham, to feed the
> > wordlist.
This isn't a terrible idea. For some users who don't really want to
train much, I'll set them up with a default wordlist (usually my own),
which usually gives them about 99% accuracy right off the bat. They
simply have to train on the minor differences between their ham/spam
and mine. Kmail could start off with a very simple wordlist containing
very common hammy words (such as "kmail", "the", etc.). For example, the
welcome message could be trained as ham. That would at least get the
process started and would not really "pollute" the database, as those
tokens would probably become neutral soon anyway.
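
To illustrate why seeded tokens should drift toward neutral, here is a
simplified Robinson-style token score. It is a sketch only, not
bogofilter's actual code; the function name `token_score` is made up,
and the defaults `robx=0.52` and `robs=0.0178` are assumed values for the
prior and its weight.

```python
def token_score(spam_hits, ham_hits, spam_msgs, ham_msgs,
                robx=0.52, robs=0.0178):
    """Simplified Robinson-style spam score for one token.

    Scores near 0 are hammy, near 1 spammy, near 0.5 neutral.
    robx (prior) and robs (prior weight) are assumed defaults.
    """
    # Fraction of trained spam / ham messages that contain the token.
    spam_rate = spam_hits / spam_msgs if spam_msgs else 0.0
    ham_rate = ham_hits / ham_msgs if ham_msgs else 0.0
    if spam_rate + ham_rate == 0:
        return robx  # never-seen token: fall back to the prior
    p = spam_rate / (spam_rate + ham_rate)
    n = spam_hits + ham_hits
    # Blend the prior with the observed ratio, weighted by evidence.
    return (robs * robx + n * p) / (robs + n)


# Right after seeding: "the" was seen in one ham message and nothing else.
seeded = token_score(spam_hits=0, ham_hits=1, spam_msgs=0, ham_msgs=1)

# After real training: "the" shows up equally often in both classes.
trained = token_score(spam_hits=45, ham_hits=45, spam_msgs=50, ham_msgs=50)

print(f"seeded: {seeded:.3f}  trained: {trained:.3f}")
```

Under these assumptions the seeded token starts out strongly hammy and
ends up close to 0.5 once it has been seen about equally often in ham
and spam, which is the "becomes neutral" effect.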
> > - bogofilter could start working after it was trained with spam emails only
> > and get the ham stats after that, when users mark false positives.
If this is possible, I'd recommend this approach too.