article on blocking by subnets - Justification
David Relson
relson at osagesoftware.com
Thu Dec 5 21:06:00 CET 2002
Barry,
Lots of data. Lots of fun :-)
Here's my thought on how to determine whether subnets provide useful info,
i.e. help classification.
First, take a month's messages and separate spam from ham.
Phase 1: run the script contrib/randomtrain. Afterwards, display MSG_COUNT
from spamlist.db and goodlist.db to see how many messages were
misclassified and hence trained on.
Phase 2: turn on blocking_by_subnets and rerun phase 1.
Are the counts different? Since the counts indicate how many messages
bogofilter got wrong, they should go down if subnets give bogofilter better
data for classifying.
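The comparison itself is just arithmetic on the four MSG_COUNT values. Here is a minimal sketch in Python, assuming you have copied the counts out by hand after each run. The function name compare_runs and the numbers in the usage note are made up for illustration; nothing here is part of bogofilter.

```python
def compare_runs(baseline, with_subnets):
    """Compare training counts from two randomtrain runs.

    Each argument is a (spamlist MSG_COUNT, goodlist MSG_COUNT) pair.
    Because only misclassified messages are trained on, each total is
    the number of messages bogofilter got wrong in that run.
    """
    base_total = sum(baseline)
    subnet_total = sum(with_subnets)
    delta = subnet_total - base_total  # negative means fewer mistakes

    if delta < 0:
        verdict = "subnets helped"
    elif delta > 0:
        verdict = "subnets hurt"
    else:
        verdict = "no difference"
    return base_total, subnet_total, delta, verdict


if __name__ == "__main__":
    # Hypothetical counts, not measured data.
    print(compare_runs(baseline=(120, 30), with_subnets=(100, 25)))
```

Since randomtrain feeds the messages in random order, a small difference between two runs may just be noise; repeating each phase a few times would give a better idea of how much the counts vary on their own.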
Notes: Be sure to start with new, empty wordlists for each run. The newest
CVS versions of bogofilter allow "block_on_subnet=Yes" to be put into the
config file, which makes testing easier.
David