Filter breakers

David Relson relson at osagesoftware.com
Sun Feb 10 06:06:27 CET 2008


On Sun, 10 Feb 2008 10:45:50 +1030
Stephen Davies wrote:

> I have been using bogofilter (currently 1.1.5) with Amavis and milter
> for a number of years now with great success.
> Every day, several hundred spams are successfully detected and zero
> hams get filtered.
> However, recently I have been having quite a number of spams getting
> through to my inbox.
> They all seem to be very small messages in three or four categories
> but despite my rerunning each case with bogofilter -Ns at least four
> times, they keep on getting through.
> 
> The only change I have made to my configuration recently is to start 
> registering hams. Previously, I only registered spams.
> 
> My database is some 200Mb.
> 
> Any ideas as to what is happening here?
> 
> TIA,
> Stephen Davies

H'lo Stephen,

Bogofilter needs a mixture of ham and spam to be effective.  Training
with both gives it the information it needs to decide whether a new
message is one or the other.  The general wisdom is that there should
be a reasonable ratio between the number of spam messages and the
number of ham messages used in training.  Training only with spam skews
the wordlist and is not recommended.

FWIW, my wordlist has accumulated tokens from approx 762,000 spam and
156,000 ham messages.  Bogofilter is currently processing approx 2,000
spam per day, and fewer than 1 a day is classified as unsure.
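
To put those counts in perspective, here is a rough Python sketch of
the arithmetic.  The function name training_ratio is made up for
illustration and is not part of bogofilter; the counts are the
approximate figures quoted above, and the unsure rate is only an upper
bound derived from "less than 1 in about 2000".

```python
def training_ratio(spam_count: int, ham_count: int) -> float:
    """Return how many spam messages were trained per ham message."""
    if ham_count <= 0:
        # A spam-only wordlist has no meaningful ratio.
        raise ValueError("no ham trained: ratio is undefined")
    return spam_count / ham_count


# Approximate counts quoted in this message (not read from a wordlist).
spam_trained, ham_trained = 762_000, 156_000

ratio = training_ratio(spam_trained, ham_trained)
spam_share = spam_trained / (spam_trained + ham_trained)
unsure_rate_bound = 1 / 2000  # fewer than 1 unsure per ~2000 spam a day

print(f"spam:ham = {ratio:.1f}:1, spam share = {spam_share:.0%}")
print(f"unsure rate below {unsure_rate_bound:.2%}")
# prints:
# spam:ham = 4.9:1, spam share = 83%
# unsure rate below 0.05%
```

So even a wordlist that leans heavily toward spam can work well, as
long as it holds a substantial amount of ham.  A wordlist trained only
on spam has no ratio at all, which is the case the function rejects.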

Bottom line: train more ham.

HTH,

David


