evaluating possible new options
Greg Louis
glouis at dynamicro.on.ca
Fri May 16 12:38:40 CEST 2003
On 20030516 (Fri) at 0951:48 +1000, michael at optusnet.com.au wrote:
> Greg Louis <glouis at dynamicro.on.ca> writes:
> > summary(aov(pc ~ fold + head + html + fold*head + fold*html +
> > + head*html + fold*head*html, data=parms))
> >                Df   Sum Sq  Mean Sq  F value    Pr(>F)
> > fold            1 0.038226 0.038226  26.5486 0.0008716 ***
> > head            1 0.296242 0.296242 205.7430 5.448e-07 ***
> > html            1 0.002262 0.002262   1.5709 0.2454608
> > fold:head       1 0.061685 0.061685  42.8410 0.0001794 ***
> > fold:html       1 0.001369 0.001369   0.9504 0.3581594
> > head:html       1 0.000251 0.000251   0.1743 0.6872818
> > fold:head:html  1 0.000251 0.000251   0.1746 0.6870339
> > Residuals       8 0.011519 0.001440
>
> A run from my corpus of 84875 spam and 48079 hams. Method
> used was to randomly divide into 4 equal blocks, then
> in turn, use one block to train and then measure against
> that block and the other three.
We don't usually test the same messages as are used to train, in these
kinds of experiments; it complicates the analysis unless the results
are discarded. But it's interesting to see how big a difference it
makes!
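For illustration, here is a minimal Python sketch of the held-out variant of that procedure: deal the corpus into four random blocks, train on one, and count errors only on the other three. The `train` and `classify` callables are hypothetical stand-ins supplied by the caller, not bogofilter's interface, and the function names are made up for this example.

```python
import random


def split_blocks(messages, k=4, seed=0):
    """Shuffle messages and deal them into k near-equal blocks."""
    shuffled = list(messages)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]


def held_out_errors(blocks, train, classify):
    """Train on each block in turn; count errors on the other blocks only.

    blocks: list of lists of (text, is_spam) pairs.
    train(pairs) -> model and classify(model, text) -> bool are
    hypothetical stand-ins supplied by the caller.
    Returns {(train_idx, test_idx): (false_pos, false_neg)}.
    """
    results = {}
    for i, train_block in enumerate(blocks):
        model = train(train_block)
        for j, test_block in enumerate(blocks):
            if i == j:
                continue  # skip the messages the model was trained on
            fp = sum(1 for text, is_spam in test_block
                     if not is_spam and classify(model, text))
            fn = sum(1 for text, is_spam in test_block
                     if is_spam and not classify(model, text))
            results[(i, j)] = (fp, fn)
    return results
```

With four blocks this yields twelve (train, test) pairs instead of sixteen, so the same-block results never enter the analysis.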
>
> Default bogofilter 0.12.3 with subject tagging turned on:
> $ perl ./out-crunch out
> CONFIG : Mindev 0.100, RobX 0.415
> 0 against 0 --> false pos 0 false neg 1425
> 0 against 1 --> false pos 0 false neg 4049
> 0 against 2 --> false pos 0 false neg 3977
> 0 against 3 --> false pos 0 false neg 3863
> 1 against 0 --> false pos 0 false neg 3770
> 1 against 1 --> false pos 0 false neg 1468
> 1 against 2 --> false pos 0 false neg 3873
> 1 against 3 --> false pos 0 false neg 3812
> 2 against 0 --> false pos 0 false neg 3859
> 2 against 1 --> false pos 0 false neg 3977
> 2 against 2 --> false pos 0 false neg 1467
> 2 against 3 --> false pos 0 false neg 3829
> 3 against 0 --> false pos 0 false neg 3923
> 3 against 1 --> false pos 0 false neg 4026
> 3 against 2 --> false pos 0 false neg 4026
> 3 against 3 --> false pos 0 false neg 1505
>
> Then the same data with latest CVS bogofilter with -Puh
> flag. (i.e. turning off case folding).
>
> [root at genconf73 db]# perl ./out-crunch out.1
> CONFIG : Mindev 0.100, RobX 0.415
> 0 against 0 --> false pos 0 false neg 1172
> 0 against 1 --> false pos 0 false neg 3283
> 0 against 2 --> false pos 0 false neg 3196
> 0 against 3 --> false pos 0 false neg 3105
> 1 against 0 --> false pos 3 false neg 3123
> 1 against 1 --> false pos 0 false neg 1166
> 1 against 2 --> false pos 2 false neg 3175
> 1 against 3 --> false pos 1 false neg 3042
> 2 against 0 --> false pos 1 false neg 3204
> 2 against 1 --> false pos 0 false neg 3304
> 2 against 2 --> false pos 0 false neg 1189
> 2 against 3 --> false pos 0 false neg 3149
> 3 against 0 --> false pos 1 false neg 3191
> 3 against 1 --> false pos 2 false neg 3285
> 3 against 2 --> false pos 3 false neg 3282
> 3 against 3 --> false pos 0 false neg 1208
>
> As you can see, there's been a jump in false positives.
Adjusting the spam cutoff to eliminate those is the best way to get a
true comparison between the two runs. I can always get fewer false
negatives at the expense of more false positives, just by twiddling the
spam cutoff. Since enabling these parameters tends to skew the
distribution of message scores, one needs to eliminate that effect in
order to be sure one's seeing a real change in the error rate.
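One way to make that adjustment, sketched in Python. It assumes each run gives you per-message scores where higher means spammier, and that a message is called spam when its score is at or above the cutoff. The function names and the toy scores are invented for illustration; they are not output from either run.

```python
def zero_fp_cutoff(ham_scores, step=1e-6):
    """Smallest cutoff (to within `step`) that calls no ham spam.

    Assumes a message is called spam when score >= cutoff, so the
    cutoff has to sit just above the highest ham score.
    """
    return max(ham_scores) + step


def false_negatives(spam_scores, cutoff):
    """Count spams scoring below the cutoff, i.e. spams that get through."""
    return sum(1 for s in spam_scores if s < cutoff)


def compare_at_zero_fp(run_a, run_b):
    """Each run is (ham_scores, spam_scores).

    Returns the false-negative count of each run at its own
    zero-false-positive cutoff.
    """
    return tuple(false_negatives(spam, zero_fp_cutoff(ham))
                 for ham, spam in (run_a, run_b))


# Toy scores, made up for the example.
run_a = ([0.10, 0.20, 0.40], [0.30, 0.50, 0.90, 0.95])
run_b = ([0.10, 0.30, 0.60], [0.50, 0.55, 0.90, 0.99])
print(compare_at_zero_fp(run_a, run_b))
```

Comparing the false-negative counts only after both runs have been pinned to the same false-positive count (here zero) removes the effect of a shifted score distribution.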
> The good news though is the huge drop in false negatives. This is an
> average drop from 15.6% to 12.7% of total spam volume (or a nearly 20%
> drop in the spam getting through).
Although fixing the fp count will reduce this, it's likely you'd still
see a 10-15% drop in the spam that passes; that would be in line with
all but one of the experiments David and I have done.
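The percentages quoted above can be checked from the two tables with a short Python sketch. It assumes the random split put an equal share of the 84875 spams in each of the four blocks, which is only approximately true; the variable and function names are mine. It also reports the rates with the same-block runs left out.

```python
SPAM_TOTAL = 84875
BLOCKS = 4
# Assumption: each test block holds an equal share of the spam.
SPAM_PER_BLOCK = SPAM_TOTAL / BLOCKS

# False negatives; rows = training block, columns = test block.
FN_DEFAULT = [
    [1425, 4049, 3977, 3863],
    [3770, 1468, 3873, 3812],
    [3859, 3977, 1467, 3829],
    [3923, 4026, 4026, 1505],
]
FN_NOFOLD = [
    [1172, 3283, 3196, 3105],
    [3123, 1166, 3175, 3042],
    [3204, 3304, 1189, 3149],
    [3191, 3285, 3282, 1208],
]


def fn_rate(matrix, include_same_block=True):
    """Mean false-negative rate over the selected (train, test) cells."""
    cells = [fn
             for i, row in enumerate(matrix)
             for j, fn in enumerate(row)
             if include_same_block or i != j]
    return sum(cells) / (len(cells) * SPAM_PER_BLOCK)


for label, keep in (("all 16 runs", True), ("held-out only", False)):
    before = fn_rate(FN_DEFAULT, keep)
    after = fn_rate(FN_NOFOLD, keep)
    print(f"{label}: {before:.1%} -> {after:.1%}, "
          f"relative drop {1 - after / before:.1%}")
```

Over all sixteen runs this reproduces the 15.6% and 12.7% figures. Dropping the same-block runs raises both rates by roughly three points but leaves the relative drop at a little over 18%, before any cutoff adjustment.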
--
| G r e g L o u i s | gpg public key: finger |
| http://www.bgl.nu/~glouis | glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |