still problem in spam management
Thomas Anderson
tanderson at orderamidchaos.com
Tue Apr 5 23:29:39 CEST 2011
On 4/4/2011 12:40 PM, Stéphane Guedon wrote:
> I have been using bogofilter for months now, and I still have a
> classification problem between ham and spam.
> Half of my spam is still classified as ham.
It sounds like you need to adjust your configuration values. Here are mine:
robx=0.69, robs=0.33, min_dev=0.2, spam_cutoff=0.7, ham_cutoff=0.3
Change your values a little at a time and watch how each change affects
the results. If you get too many false negatives (spam classified as
ham), lower spam_cutoff a bit. I think the default is 0.99, which is way
too high. Try 0.90. If that doesn't produce any false positives, you can
move it down a little more. You can also try raising robx slightly. robx
is the score given to a word the first time it's seen, so a higher value
makes new words a little more "suspicious". Just don't set robx higher
than spam_cutoff.
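To make the effect of robx, robs and min_dev a bit more concrete, here is
a rough Python sketch of Robinson-style token scoring. It is not
bogofilter's actual code: the function names token_score and
is_significant are made up for illustration, and the raw word probability
is simplified by assuming the spam and ham corpora are the same size.

```python
def token_score(spam_count, ham_count, robx=0.69, robs=0.33):
    """Smoothed token score f(w) = (s*x + n*p) / (s + n).

    s = robs (weight of the prior), x = robx (score of an unseen token),
    n = how often the token has been seen, p = raw spam probability.
    Assumption: equal-sized spam and ham corpora, so p = spam / total.
    """
    n = spam_count + ham_count
    if n == 0:
        # A token never seen before simply gets robx.
        return robx
    p = spam_count / n
    return (robs * robx + n * p) / (robs + n)


def is_significant(score, min_dev=0.2):
    """A token only counts if its score is at least min_dev away from 0.5."""
    return abs(score - 0.5) >= min_dev


if __name__ == "__main__":
    print(token_score(0, 0))             # unseen token: 0.69
    print(token_score(0, 0, robx=0.8))   # unseen token with higher robx: 0.8
    print(token_score(10, 0))            # seen 10 times, only in spam
    print(is_significant(token_score(10, 0)))
```

In this simplified model, raising robx pushes brand-new words toward the
spam side, and a small robs lets real counts take over from robx quickly.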
Also, if you're not already using tri-state classification, turn it on
by setting ham_cutoff > 0. That gives you "ham", "spam", and "unsure".
You can read and respond to the real ham right away, then go through
the unsures later and sort ham from spam before training on them.
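Here is a small Python sketch of how the two cutoffs split messages into
three groups. The classify function is hypothetical and only mirrors the
idea. How bogofilter itself treats a score exactly equal to a cutoff is an
assumption here.

```python
def classify(spamicity, spam_cutoff=0.7, ham_cutoff=0.3):
    """Map a spamicity score in [0, 1] to "spam", "ham" or "unsure".

    Assumption: scores at or above spam_cutoff are spam, scores at or
    below ham_cutoff are ham. With ham_cutoff set to 0 there is no
    "unsure" band, so everything below spam_cutoff counts as ham.
    """
    if spamicity >= spam_cutoff:
        return "spam"
    if ham_cutoff <= 0 or spamicity <= ham_cutoff:
        return "ham"
    return "unsure"


if __name__ == "__main__":
    for score in (0.1, 0.5, 0.8, 0.95):
        two_state = classify(score, spam_cutoff=0.99, ham_cutoff=0)
        tri_state = classify(score)
        print(score, two_state, tri_state)
```

With a 0.99 cutoff and no ham_cutoff, a message scoring 0.8 or even 0.95
comes out as ham in this sketch, which is the kind of missed spam
described above. With cutoffs of 0.7 and 0.3 the same messages come out as
spam, and anything in the middle lands in "unsure".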
Tom