best practices question

Jeremy Blosser jblosser-bogofilter at firinn.org
Fri Sep 20 22:37:59 CEST 2002


On Sep 20, David Relson [relson at osagesoftware.com] wrote:
> I am aware of two ways of training and using bogofilter.
> 
> 1 - Create good and spam word lists (using the '-h' and '-s' options).  Let 
> bogofilter classify messages.  For incorrectly classified messages, feed 
> them into the word lists (again using the '-h' and '-s' options).
> 
> 2 - Create word lists (as above).  When a message is classified as spam, 
> automatically merge it into the word list (using '-s').  This will expand 
> the spam list by including words that have "appeared in a spam 
> context".  For incorrectly classified messages, use the '-H' and '-S' 
> options so that probabilities will shift from the wrong answer to the right 
> answer.
> 
> What do y'all think is the best practice for handling word list updating?

Definitely the second.  It's the only way to have a complete view of your
mail, and it allows bogofilter to automatically retrain itself over time to
adapt to the nature of spam as it [d]evolves.

The existence of the -H and -S options is one of the better features of
bogofilter and this approach to fighting spam, IMO.
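
To make the difference between the two approaches concrete, here is a toy
sketch in Python. It is not bogofilter's code: the two in-memory counters, the
whitespace tokenizer, and the function names register() and reclassify() are
all assumptions made for illustration. It only models the behavior described
above, where '-h' and '-s' add a message's words to a list and '-H' and '-S'
shift them from the wrong list to the right one.

```python
from collections import Counter

# Toy stand-ins for the good and spam word lists (hypothetical structure;
# the real lists live in on-disk databases).
good = Counter()
spam = Counter()


def tokens(message):
    """Split a message into lowercase words (a deliberately crude tokenizer)."""
    return message.lower().split()


def register(message, wordlist):
    """Add a message's words to one list, as '-h' or '-s' is described above."""
    wordlist.update(tokens(message))


def reclassify(message, wrong, right):
    """Shift a message's words from the wrong list to the right one,
    as '-H' and '-S' are described above."""
    for word in tokens(message):
        if wrong[word] > 1:
            wrong[word] -= 1
        else:
            # Drop the entry once its count reaches zero (or if it was absent).
            wrong.pop(word, None)
        right[word] += 1


# Approach 2: a message classified as spam is merged into the spam list.
msg = "meeting agenda free lunch"
register(msg, spam)

# The classification was wrong, so the user corrects it.
reclassify(msg, wrong=spam, right=good)
```

In this sketch the correction leaves no trace of the message in the spam
counter. Calling register(msg, good) instead would leave its words counted in
both lists, which is why automatic merging needs the shifting form of
correction.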

For summary digest subscription: bogofilter-digest-subscribe at aotto.com


