many spams suddenly getting through

Tom Anderson tanderso at oac-design.com
Mon Sep 13 20:45:45 CEST 2004


From: ".rp" <printer at moveupdate.com>
> Might just be a fluke. One weekend we had this problem when a certain
> word that the spammers used in their "random words" (designed to
> overwhelm Bayesian filters) actually did overwhelm ours. Fortunately BF
> started catching on and also the spammers moved on to other words.

Yeah, I'd say just try sending bogofilter the corrections and see if it 
comes back down to previous levels again.  I sometimes get days where lots 
of spam gets through, but sending the corrections kills it off quickly. 
Spammers really cannot adapt as quickly as bogofilter can.  That's what 
makes a statistical filter so much better than a heuristic one.
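
For anyone scripting the corrections, here is a minimal sketch in Python. 
The flags are the ones documented in the bogofilter man page (-s/-n 
register a message as spam/ham, -S/-N undo an earlier registration).  The 
helper names and the "missed_spam"/"false_positive" labels are made up for 
illustration.  It assumes the message was already registered with the wrong 
class (for example by running with -u); if it never was, plain -s or -n is 
the appropriate flag instead.

```python
import subprocess

# Flags per the bogofilter man page: -s / -n register a message as
# spam / ham, -S / -N remove an earlier spam / ham registration.
# The dictionary keys are hypothetical labels, not bogofilter terms.
CORRECTION_FLAGS = {
    "missed_spam": "-Ns",      # was registered as ham, is actually spam
    "false_positive": "-Sn",   # was registered as spam, is actually ham
}


def correction_command(kind):
    """Return the argv list for the given kind of correction."""
    return ["bogofilter", CORRECTION_FLAGS[kind]]


def send_correction(kind, message_path):
    """Feed one saved message to bogofilter on stdin as a correction.

    Assumes bogofilter is installed and on PATH.
    """
    with open(message_path, "rb") as msg:
        subprocess.run(correction_command(kind), stdin=msg, check=True)
```

Something like send_correction("missed_spam", "/path/to/message") would 
then retrain on one spam that slipped through; the path is a placeholder.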

If you continue to have an influx of false negatives, but still have no 
false positives, try changing your cutoffs.  I've found that I've generally 
had to lower my spam cutoff over time.  As your database grows, your hammy 
tokens remain relatively tight, and strengthen over time, but your spammy 
tokens continue to grow in number and get diluted.  This is why hams are 
usually very certain (under 0.01) for me, but spams are relatively uncertain 
(down to about 0.1 sometimes, up through 1.0).  Bogofilter, for me at least, 
is more of a ham filter than a spam filter... it identifies the hams fairly 
well, and except for a small "unsure" buffer up to the robx value, 
everything else is considered spam.

Make note of the highest-scoring ham, and set your spam cutoff above that to 
prevent false positives.  Lower the spam cutoff over time.  You can also 
lower your ham cutoff over time so that some of your hams end up in your 
unsure box, and you can send them as corrections to further strengthen the 
ham classifications, which will then allow you to lower your spam cutoff 
some more.
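
The cutoff advice above can be sketched roughly as follows.  This is only 
an illustration: the function names and the 0.05 margin are arbitrary 
choices of mine, not bogofilter defaults, and the comparison rules in 
classify() (spam at or above the spam cutoff, ham at or below the ham 
cutoff, unsure in between) are my understanding of how the two cutoffs are 
applied.

```python
def suggest_spam_cutoff(ham_scores, margin=0.05):
    """Pick a spam cutoff just above the highest-scoring known ham.

    `margin` is an arbitrary safety gap chosen for illustration.
    """
    if not ham_scores:
        raise ValueError("need at least one ham score")
    return min(1.0, max(ham_scores) + margin)


def classify(score, ham_cutoff, spam_cutoff):
    """Three-way split of a score using the two cutoffs.

    Assumed rule: spam at or above spam_cutoff, ham at or below
    ham_cutoff, everything in between is unsure.
    """
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"


# Scores of messages known to be ham (made-up numbers).
ham_scores = [0.001, 0.004, 0.2]
spam_cutoff = suggest_spam_cutoff(ham_scores)

# With a low ham cutoff, the 0.2 ham lands in "unsure", where it can be
# sent back as a correction to strengthen the ham tokens.
labels = [classify(s, ham_cutoff=0.01, spam_cutoff=spam_cutoff)
          for s in ham_scores]
```

The resulting values would go into the spam_cutoff and ham_cutoff settings 
in your bogofilter configuration; rerun the check as the database grows and 
the highest ham score drops.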

Tom



