bogofilter-tuning.HOWTO

Tom Anderson tanderso at oac-design.com
Sun Feb 1 19:56:57 CET 2004


The bogofilter-tuning.HOWTO file appears to need updating.  It should
assume the use of a single wordlist.db instead of separate files, and
it need not discuss the "other" classification methods.

Also, the tutorial recommends a robs value >= 0.01.  However, the
default in bogofilter.cf appears to be 0.001, which the tuning HOWTO
specifically warns against.  The default min_dev also appears to be
artificially low compared to the value the HOWTO recommends.
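
To show why the size of robs matters, here is a minimal sketch of
Robinson-style smoothing for a single token.  It is not bogofilter's
actual code: the function name and the robx default of 0.5 are my own
assumptions, and the raw spam fraction is used without scaling by the
number of messages in each class.

```python
def smoothed_probability(spam_count, ham_count, robs, robx=0.5):
    """Robinson-style smoothed spam probability for one token.

    Simplification (assumption): p is the raw fraction of sightings
    that were spam, not scaled by the spam and ham message totals.
    robx=0.5 is an illustrative default, not a recommended setting.
    """
    n = spam_count + ham_count
    if n == 0:
        # A token never seen before gets the prior robx.
        return robx
    p = spam_count / n
    # robs is the weight given to the prior robx; n weights the evidence.
    return (robs * robx + n * p) / (robs + n)


# A token seen exactly once, in a spam, under the two robs values discussed.
for robs in (0.001, 0.01):
    print(robs, round(smoothed_probability(1, 0, robs), 5))
```

With robs at 0.001 the single sighting pushes the token to roughly
0.9995; with robs at 0.01 it lands near 0.9950.  A smaller robs lets
rarely seen tokens take more extreme values.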

Moreover, I'm fairly certain that most people receive many more spams
than hams, and I'm concerned about the verbiage recommending nearly
equal numbers in the list.  Trying to maintain such an equilibrium is
quite a time-intensive process and probably unfeasible for most people. 
My list contains 13598 spams to 4278 hams, or roughly 3:1, but I don't
seem to suffer any ill effects.  In fact, I don't receive any false
positives at all, ever, and only 5-8 false negatives per diem (cutoffs
at 0.25, 0.65).  So where does this equilibrium recommendation stem
from?
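
For reference, the figures above can be checked with a small sketch.
The function name is hypothetical, and the handling of scores that fall
exactly on a cutoff is an assumption on my part rather than something
taken from the bogofilter source.

```python
SPAM_COUNT = 13598   # spams in my wordlist
HAM_COUNT = 4278     # hams in my wordlist
HAM_CUTOFF = 0.25    # my lower cutoff
SPAM_CUTOFF = 0.65   # my upper cutoff


def classify(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Three-way split of a spamicity score (boundary handling assumed)."""
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"


spam_to_ham = SPAM_COUNT / HAM_COUNT
print(f"spam:ham ratio is about {spam_to_ham:.2f}:1")
print(classify(0.10), classify(0.50), classify(0.90))
```

The ratio works out to a little under 3.2:1, which is where "roughly
3:1" comes from.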

Finally, I would suggest that the warning about requiring constant
updating in the -u mode be amended to consider the flip side.  Without
training, as spam tactics mutate over time, your database may become
just as unusable.  So, in either case, you still have to train
consistently.  Using -u permits you to save time by only training on the
mistakes instead of on every email that arrives.

Greg, would you be interested, as the original author, in revising this
file at all?

Tom

P.S. Bogofilter rocks!  Those 13598 spams have arrived only since
October, when I started a new database from scratch.  Assuming 1-2
seconds to check each of those spams manually without bogofilter,
versus 1-2 seconds per dozen or two when scanning my spam box for false
positives with bogofilter, I've saved around 4 hours of time.  That's
about 1 hour per
month.  In the course of a year, that means I get 1.5 8-hour work days
of bonus vacation because of bogofilter!
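
The back-of-the-envelope arithmetic can be sketched as follows.  It
ignores the small amount of time still spent scanning the spam box, and
the helper name is hypothetical.

```python
SPAMS = 13598
SECONDS_PER_HOUR = 3600


def hours_spent(messages, seconds_each):
    """Hours needed to check `messages` at `seconds_each` seconds apiece."""
    return messages * seconds_each / SECONDS_PER_HOUR


low = hours_spent(SPAMS, 1)    # 1 second per spam
high = hours_spent(SPAMS, 2)   # 2 seconds per spam

hours_per_year = 1 * 12                  # about 1 hour saved per month
workdays_per_year = hours_per_year / 8   # 8-hour work days

print(f"saved so far: {low:.1f} to {high:.1f} hours")
print(f"per year: {workdays_per_year} work days")
```

The range comes out at roughly 3.8 to 7.6 hours, so "around 4 hours" is
the low end of the estimate.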