more spamitarium results
Tom Allison
tallison at tacocat.net
Sat May 15 03:13:57 CEST 2004
Tom Allison wrote:
> Ran some more tests looking at the average score and not the attribute
> counts of ham/spam/unsure evaluations.
>
> The consistency between all the spamitarium tests that have parms -rad
> seems to be related to an observation that the resulting email has a
> diff of typically one or two words (I only saw one, but I'm easy).
>
> Average - Average Test
> Corpus  Parms    0          1          2
> ham     none     0.0049788  0.0062384  0.0059236
>         radw     0.0065136  0.0071255  0.0084594
>         readw    0.0065136  0.0071255  0.0084594
>         sradw    0.0065136  0.0071255  0.0084594
>         sreadw   0.0065136  0.0071255  0.0084594
>         sw       0.0065162  0.0071285  0.0084625
> spam    none     0.9598745  0.9732174  0.9700802
>         radw     0.9556415  0.9707304  0.9707012
>         readw    0.9556415  0.9707304  0.9707012
>         sradw    0.9556416  0.9707307  0.9707013
>         sreadw   0.9556416  0.9707307  0.9707013
>         sw       0.9556426  0.9707308  0.9707037
I realized after running these that I had been using the default
min_dev=0.375. I have since re-run the tests with min_dev=0.10:

Average - Score Test
Corpus  Parms    0           1           2
ham     none     0.00233790  0.00371610  0.00310700
        radw     0.00374480  0.00461440  0.00509060
        readw    0.00374480  0.00461440  0.00509060
        sradw    0.00374480  0.00461430  0.00509110
        sreadw   0.00374480  0.00461430  0.00509110
        sw       0.00374800  0.00461890  0.00509380
spam    none     0.96398650  0.97528450  0.97076220
        radw     0.95979440  0.97210260  0.97014640
        readw    0.95979440  0.97210260  0.97014640
        sradw    0.95979440  0.97210230  0.97014720
        sreadw   0.95979440  0.97210230  0.97014720
        sw       0.95979510  0.97210230  0.97014860

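
To make the shift relative to the unfiltered mail easier to see, here is a small Python sketch that subtracts the "none" row from a filtered row for each test. The numbers are copied from the min_dev=0.10 table above; the dictionary layout and the deltas() helper are only illustrative names and are not part of bogofilter or spamitarium.

```python
# Average scores from the min_dev=0.10 run, copied from the table above.
# Only "none", "radw" and "sw" are listed; the other rows are nearly identical.
scores = {
    "ham": {
        "none": [0.00233790, 0.00371610, 0.00310700],
        "radw": [0.00374480, 0.00461440, 0.00509060],
        "sw":   [0.00374800, 0.00461890, 0.00509380],
    },
    "spam": {
        "none": [0.96398650, 0.97528450, 0.97076220],
        "radw": [0.95979440, 0.97210260, 0.97014640],
        "sw":   [0.95979510, 0.97210230, 0.97014860],
    },
}


def deltas(corpus, parms):
    """Per-test difference between a filtered run and the unfiltered run."""
    base = scores[corpus]["none"]
    return [round(v - b, 8) for v, b in zip(scores[corpus][parms], base)]


for corpus in scores:
    for parms in scores[corpus]:
        if parms != "none":
            print(corpus, parms, deltas(corpus, parms))
```

With these figures every ham delta is positive and every spam delta is negative, so in all three tests the filtered mail averages slightly closer to 0.5 than the unfiltered mail does.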
robx = 0.500000 # (5.00e-01)
robs = 0.100000 # (1.00e-01)
min_dev = 0.100000 # (1.00e-01)
ham_cutoff = 0.100000 # (1.00e-01)
spam_cutoff = 0.900000 # (9.00e-01)
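
For readers unfamiliar with these parameters, the following Python sketch shows how I understand them to interact. It assumes the usual Robinson smoothing formula for robx and robs, that min_dev is the minimum distance from 0.5 a token's probability needs before it is scored, and that the cutoffs are inclusive. The function names are made up for illustration, and the sketch does not reproduce bogofilter's actual code or its method of combining token scores.

```python
# Values taken from the configuration listed above.
ROBX, ROBS, MIN_DEV = 0.5, 0.1, 0.1
HAM_CUTOFF, SPAM_CUTOFF = 0.1, 0.9


def robinson_f(p, n, robx=ROBX, robs=ROBS):
    """Smooth a raw token probability p seen n times toward robx.

    Assumed form: f(w) = (robs * robx + n * p) / (robs + n).
    """
    return (robs * robx + n * p) / (robs + n)


def is_scored(fw, min_dev=MIN_DEV):
    """Assumption: tokens closer to 0.5 than min_dev are ignored."""
    return abs(fw - 0.5) >= min_dev


def classify(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Three-way result from the final message score (inclusive cutoffs assumed)."""
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"


# A token with smoothed probability 0.8 is skipped at min_dev=0.375
# but counted at min_dev=0.10.
print(is_scored(0.8, min_dev=0.375), is_scored(0.8, min_dev=0.10))
```

Under these assumptions, lowering min_dev from 0.375 to 0.10 lets more near-neutral tokens take part in scoring, which would be one reason the averages differ between the two runs.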