still problem in spam management

Thomas Anderson tanderson at orderamidchaos.com
Tue Apr 5 23:29:39 CEST 2011


On 4/4/2011 12:40 PM, Stéphane Guedon wrote:
> I have been using bogofilter for months now, and I still have a problem
> with classification between ham and spam.
> Half of my spam is still classified as ham.

It sounds like you need to adjust your configuration values.  Here are mine:

robx=0.69, robs=0.33, min_dev=0.2, spam_cutoff=0.7, ham_cutoff=0.3

You might want to adjust yours only a little at a time and see how it 
affects things.  If you get too many false negatives, lower your 
spam_cutoff a little.  I think the default is something like 0.99, which 
is way too high.  Try 0.90.  If you don't get any false positives, you 
can try moving it down a little more.  You can also try moving your robx 
up a little.  This makes words a little more "suspicious" the first time 
they're seen.  Just don't set robx higher than spam_cutoff.

Also, if you're not currently using tri-state classification, you should 
turn it on by setting ham_cutoff > 0.  This gives you "ham", "spam", and 
"unsure", which makes it easier to receive and respond to actual hams 
right away, and to look through the unsures later to separate the hams 
from the spams before training on them.
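
To make the roles of these parameters concrete, here is a rough sketch in 
Python.  It is not bogofilter's actual code: the function names are made 
up for illustration, and the exact boundary comparisons (>= versus >) are 
an assumption.  The word score uses Robinson's smoothing formula 
f(w) = (s*x + n*p(w)) / (s + n), with s = robs and x = robx, so a word 
that has never been seen (n = 0) scores exactly robx.

```python
def word_score(p_w, n, robx=0.69, robs=0.33):
    """Robinson's smoothed score f(w) = (robs*robx + n*p_w) / (robs + n).

    p_w is the raw spam probability of the word, n is how many times the
    word has been seen.  With n == 0 the result is simply robx.
    """
    return (robs * robx + n * p_w) / (robs + n)


def is_counted(f_w, min_dev=0.2):
    """Words scoring within min_dev of 0.5 are ignored (boundary assumed)."""
    return abs(f_w - 0.5) >= min_dev


def classify(score, spam_cutoff=0.7, ham_cutoff=0.3):
    """Map a message score to "spam", "ham", or "unsure".

    A ham_cutoff of 0 is treated as two-state mode, where everything
    below spam_cutoff is ham (assumed behaviour).
    """
    if score >= spam_cutoff:
        return "spam"
    if ham_cutoff <= 0 or score <= ham_cutoff:
        return "ham"
    return "unsure"


if __name__ == "__main__":
    print(word_score(0.0, 0))            # unseen word: score is robx
    print(classify(0.95), classify(0.5), classify(0.1))
    print(classify(0.5, ham_cutoff=0))   # two-state: no "unsure"
```

The sketch also shows why robx should stay below spam_cutoff: if a 
never-seen word already scored above the cutoff, a message made up of 
unfamiliar words would lean toward spam before any training had happened.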

Tom



