more test results
David Relson
relson at osagesoftware.com
Thu Feb 13 19:38:04 CET 2003
Greetings,
Here are more results from efficacy testing, this time measuring the value
of 3 bogofilter config options:
asc - replace-non-ascii
net - block-on-subnets
tag - tag-header-lines
Each test used wordlists built from the Oct-Dec 2002 training corpus with
the same options that were being tested. All 8 combinations of the 3 options
were tested ("def" denotes the default config with all 3 options disabled).
The contrib/randomtrain script was used to train on errors, and the same
shuffled message order was used for all 8 tests.
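As a rough illustration of the test matrix (this is not the contrib/randomtrain script itself), the 8 configurations and the shared message order could be generated as below. The function names and the seed value are assumptions made up for this sketch.

```python
import itertools
import random

OPTIONS = ["asc", "net", "tag"]  # short option names used in this message


def option_combinations(options):
    """Return a label for every subset of the options; the empty subset is 'def'."""
    labels = []
    for size in range(len(options) + 1):
        for subset in itertools.combinations(options, size):
            labels.append("-".join(subset) if subset else "def")
    return labels


def shuffled_order(messages, seed=20030213):
    """Shuffle once with a fixed seed (hypothetical value) so every test sees the same order."""
    order = list(messages)
    random.Random(seed).shuffle(order)
    return order


configs = option_combinations(OPTIONS)
# The all-three label comes out as "asc-net-tag"; the results table writes it "net-tag-asc".
print(len(configs), configs)
```

Reusing one seeded shuffle keeps the message order identical across runs, so differences between runs come from the options rather than from the order.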
The numbers are below. The two "reg" columns (spam-reg and good-reg) give
the number of spam and good messages that bogofilter classified incorrectly.
The randomtrain script then trains bogofilter on each error so that it can
do better on the next message. For comparison purposes, I have included the
earlier test results (the s-s through h-u columns) at the end of each data
line.
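The train-on-error loop described above can be sketched as follows. This is a minimal illustration, not the randomtrain script: `classify` and `train` are hypothetical callbacks standing in for bogofilter, and counting an "unsure" verdict as an error is an assumption.

```python
def train_on_error(messages, classify, train):
    """Classify (text, is_spam) pairs in order and train only on mistakes.

    classify(text) returns "spam", "ham", or "unsure" (hypothetical callback).
    train(text, is_spam) registers the message (hypothetical callback).
    Returns (spam_reg, good_reg): how many messages of each kind were registered.
    """
    spam_reg = 0
    good_reg = 0
    for text, is_spam in messages:
        expected = "spam" if is_spam else "ham"
        if classify(text) != expected:  # wrong or unsure counts as an error (assumption)
            train(text, is_spam)
            if is_spam:
                spam_reg += 1
            else:
                good_reg += 1
    return spam_reg, good_reg


# Toy stand-in for bogofilter: a message is spam if it shares a word with trained spam.
spam_words = set()


def toy_classify(text):
    return "spam" if spam_words & set(text.split()) else "ham"


def toy_train(text, is_spam):
    if is_spam:
        spam_words.update(text.split())


demo = [("buy pills", True), ("buy pills", True), ("meeting notes", False)]
result = train_on_error(demo, toy_classify, toy_train)
print(result)  # (1, 0): only the first copy of the duplicated spam needs registering
```

In the toy run, registering the first spam is enough to catch its duplicate. That is the same effect suggested for duplicated spams in the message.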
With all the numbers together, it's interesting to note that the good-reg
number is virtually the same as the ham-unsure (h-u) number. In contrast,
the spam-reg value is 50-60 messages smaller than the spam-unsure (s-u)
number. The implication is that train-on-error for the spam messages helps
a lot in identifying later spam, while the ham results aren't
affected. This may be caused by the many duplicated spams received by my
mail server.
David
02/13 10:37
             spam  reg  good  reg   s-s  s-h  s-u  h-s   h-h  h-u
def          1745   82  5044  123  1609    3  133    2  4918  124
asc          1745   83  5044  123  1608    3  134    2  4918  124
net          1745   81  5044  113  1604    5  136    2  4934  108
tag          1745   81  5044  125  1604    3  138    2  4918  124
asc-net      1745   83  5044  113  1602    4  139    2  4934  108
asc-tag      1745   82  5044  125  1603    3  139    2  4918  124
net-tag      1745   85  5044  114  1599    4  142    2  4928  114
net-tag-asc  1745   87  5044  114  1597    3  145    2  4928  114
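
The comparison between the "reg" and "unsure" columns can be checked with a few lines of Python. The values are copied from the table above; the tuple layout and the function name are just choices made for this sketch.

```python
# (config, spam_reg, good_reg, s_u, h_u) copied from the table above
ROWS = [
    ("def", 82, 123, 133, 124),
    ("asc", 83, 123, 134, 124),
    ("net", 81, 113, 136, 108),
    ("tag", 81, 125, 138, 124),
    ("asc-net", 83, 113, 139, 108),
    ("asc-tag", 82, 125, 139, 124),
    ("net-tag", 85, 114, 142, 114),
    ("net-tag-asc", 87, 114, 145, 114),
]


def gaps(rows):
    """Return (config, s_u - spam_reg, h_u - good_reg) for each row."""
    return [
        (name, s_u - spam_reg, h_u - good_reg)
        for name, spam_reg, good_reg, s_u, h_u in rows
    ]


for name, spam_gap, ham_gap in gaps(ROWS):
    print(f"{name:12s} s-u minus spam-reg = {spam_gap:3d}   h-u minus good-reg = {ham_gap:3d}")
```

With the numbers as listed, the spam gap runs from 51 to 58 messages, while the ham gap stays between -5 and 1.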
classification parameters - all bogofilter's default values:
robs - 0.001
robx - 0.415
min_dev - 0.10
spam_cutoff - 0.95
ham_cutoff - 0.10
--------------------------------------------------------
David Relson Osage Software Systems, Inc.
relson at osagesoftware.com Ann Arbor, MI 48103
www.osagesoftware.com tel: 734.821.8800