no filtering without ham

Thomas Anderson tanderson at orderamidchaos.com
Wed Jan 21 15:14:50 CET 2009


On Sun, 2009-01-18 at 04:20 +0100, Matthias Andree wrote:
> a - let kmail use bogofilter in three-state mode (ham/spam/unsure) -
>   that's the default anyways
> 
> b - then have kmail file unsure messages into an unsure folder

This is what I usually do when I set up a new user with bogofilter.  I
think it makes the most sense to them: every message comes in as unsure
until they train it one way or the other.
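
For anyone scripting this outside kmail, here is a minimal sketch of the
three-way filing in Python.  It relies on bogofilter's documented exit
codes (0 spam, 1 ham, 2 unsure, 3 error).  The folder names and both
helper functions are my own placeholders, not anything kmail or
bogofilter defines.

```python
import subprocess

# Hypothetical folder names; use whatever your mail client calls them.
FOLDER_BY_EXIT_CODE = {0: "spam", 1: "inbox", 2: "unsure"}


def folder_for_exit_code(code):
    """Map a bogofilter exit code to a folder name."""
    try:
        return FOLDER_BY_EXIT_CODE[code]
    except KeyError:
        # Exit code 3 (or anything unexpected) means bogofilter itself failed.
        raise RuntimeError(f"bogofilter failed with exit code {code}")


def classify(message_bytes):
    """Pipe one raw message through bogofilter and return its folder."""
    result = subprocess.run(["bogofilter"], input=message_bytes)
    return folder_for_exit_code(result.returncode)
```

Calling classify() on the raw bytes of a message returns "unsure" for
anything bogofilter cannot decide, which is what a new user will mostly
see until they have trained a bit.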

> > - Kmail could pipe some bogus mails through bogofilter as ham, to feed the 
> > wordlist.

This isn't a terrible idea.  For users who don't really want to train
much, I'll set them up with a default wordlist (usually my own), which
usually gives them 99% accuracy right off the bat.  They simply have to
train on the minor differences between their ham/spam and mine.  Kmail
could start off with a very simple wordlist containing very common
hammy words (such as "kmail", "the", etc.).  E.g., they could train
their welcome message as ham.  That would at least get the process
started and would not really "pollute" the database, as those tokens
would probably become neutral soon anyway.
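
A rough sketch of that seeding step, assuming bogofilter's -n (register
as ham) and -s (register as spam) options with the message read from
stdin.  The welcome text and the function names are made up for
illustration.

```python
import subprocess


def training_command(is_spam):
    """Return the bogofilter argv that registers a message as spam or ham."""
    return ["bogofilter", "-s" if is_spam else "-n"]


def train(message_bytes, is_spam):
    """Feed one raw message to bogofilter as a training example."""
    result = subprocess.run(training_command(is_spam), input=message_bytes)
    if result.returncode == 3:
        # bogofilter documents exit code 3 as an I/O or other error.
        raise RuntimeError("bogofilter reported an error while training")


# Hypothetical welcome message used to seed the ham side of the wordlist.
WELCOME = b"Subject: Welcome to kmail\n\nThis is the kmail welcome message.\n"
```

Running train(WELCOME, is_spam=False) once at setup would give the
wordlist its first ham tokens.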

> > - bogofilter could start working after it was trained with spam emails only 
> > and get the ham stats after that, when users mark false positives.

If this is possible, I'd recommend this approach too.

Tom





