spam cutoff less than neutral?
David Relson
relson at osagesoftware.com
Mon Feb 23 15:29:20 CET 2004
On 23 Feb 2004 09:19:27 -0500
Tom Anderson wrote:
> robx=0.455
> robs=0.1
> min_dev=0.25
> spam_cutoff=0.55
> ham_cutoff=0.15
>
> I'm receiving zero false positives (awesome). I receive on average
> maybe one false negative per day (pretty good). Upwards of 60 spams
> are successfully classified (nice). But, I get 20-30 unsures which
> are almost always (>99%) spam. I use -u, and correct all errors and
> unsures. Many unsures are of the type I previously identified on this
> list... they are very long and contain many normal English words,
> generally scoring 0.5 and not moving much from that after repeated
> training.
>
> Given these numbers, I'm tempted to move my spam_cutoff even further
> down. However, since 0.5 should theoretically be "unsure", I'm
> hesitant to move the spam_cutoff much further due to the philosophical
> implications. This is particularly true if I move spam_cutoff too
> close to robx. False positives are unacceptable, and heretofore
> unseen emails need the benefit of the doubt. Already my spam_cutoff
> is less than min_dev, which itself seems somewhat hypocritical.
>
> Should I keep my spam_cutoff as is and just continue correcting
> unsures? Or is it safe to move it into "unsure" territory? Does
> anyone else have a very low spam_cutoff? Does it produce any false
> positives? Can tweaking the other numbers push more of these unsures
> into the spam territory without moving spam_cutoff?
>
> Tom
Tom,
Divide your unsures into two groups: ham and spam. Look at the
highest-scoring ham message. To avoid false positives, your spam_cutoff
needs to be higher than that message's score. Similarly, you can raise
your ham_cutoff, keeping it below the lowest-scoring spam message, to
lower the unsure count.
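As a rough illustration of that procedure, here is a minimal Python sketch. It is not part of bogofilter or bogotune; the function name suggest_cutoffs, the margin parameter, and the sample scores are all hypothetical, and it assumes you have already sorted your unsures by hand and noted each message's score.

```python
def suggest_cutoffs(ham_scores, spam_scores, margin=0.01):
    """Suggest (ham_cutoff, spam_cutoff) from hand-sorted unsure scores.

    spam_cutoff is placed just above the highest-scoring ham so that no
    known ham would be classified as spam; ham_cutoff is placed just
    below the lowest-scoring spam. `margin` is an arbitrary safety gap.
    """
    if not ham_scores or not spam_scores:
        raise ValueError("need at least one ham score and one spam score")

    spam_cutoff = round(max(ham_scores) + margin, 3)
    ham_cutoff = round(min(spam_scores) - margin, 3)

    # If the two groups do not overlap, the naive ham_cutoff could end up
    # above spam_cutoff; clamp it so the pair stays consistent.
    ham_cutoff = min(ham_cutoff, spam_cutoff)
    return ham_cutoff, spam_cutoff


# Hypothetical scores, for illustration only.
unsure_ham = [0.21, 0.34, 0.48]
unsure_spam = [0.41, 0.50, 0.52]
ham_cutoff, spam_cutoff = suggest_cutoffs(unsure_ham, unsure_spam)
print(f"ham_cutoff={ham_cutoff}")
print(f"spam_cutoff={spam_cutoff}")
```

When the two groups overlap, as in the sample data, the messages scoring between the two cutoffs stay unsure. That is the price of avoiding false positives.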
Based on bogotune results, I'm presently using:
  ham_cutoff=0.376
  spam_cutoff=0.501
I'm getting several unsures a day. Compared to several hundred spam
daily, it's not a big deal.
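To make the effect of the two cutoffs concrete, the following Python sketch sorts a score into ham, unsure, or spam. It is an illustration rather than bogofilter's own code: the function name classify is hypothetical, and the treatment of scores that land exactly on a cutoff (>= for spam, <= for ham) is an assumption.

```python
def classify(score, ham_cutoff=0.376, spam_cutoff=0.501):
    """Three-way classification by cutoff (boundary handling is assumed)."""
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"


# A message scoring 0.5 stays unsure under either set of cutoffs quoted
# in this thread, because both spam_cutoff values sit above 0.5.
print(classify(0.5))                                    # ham 0.376 / spam 0.501
print(classify(0.5, ham_cutoff=0.15, spam_cutoff=0.55)) # ham 0.15 / spam 0.55
```

So a spam_cutoff of 0.501 would still leave the long messages that score 0.5 as unsure. Only a cutoff at or below 0.5 would move them.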
David