more spamitarium results

Tom Allison tallison at tacocat.net
Sat May 15 03:13:57 CEST 2004


Tom Allison wrote:
> Ran some more tests looking at the average score and not the attribute 
> counts of ham/spam/unsure evaluations.
> 
> The consistency between all the spamitarium tests that have parms -rad 
> seems to be related to an observation that the resulting email has a 
> diff of typically one or two words (I only saw one, but I'm easy).
> 
> Average score           Test
> Corpus  Parms   0          1          2
> ham     none    0.0049788  0.0062384  0.0059236
>         radw    0.0065136  0.0071255  0.0084594
>         readw   0.0065136  0.0071255  0.0084594
>         sradw   0.0065136  0.0071255  0.0084594
>         sreadw  0.0065136  0.0071255  0.0084594
>         sw      0.0065162  0.0071285  0.0084625
> spam    none    0.9598745  0.9732174  0.9700802
>         radw    0.9556415  0.9707304  0.9707012
>         readw   0.9556415  0.9707304  0.9707012
>         sradw   0.9556416  0.9707307  0.9707013
>         sreadw  0.9556416  0.9707307  0.9707013
>         sw      0.9556426  0.9707308  0.9707037
> 
> 

After running these tests I realized I had been using the default
min_dev=0.375.

I have since re-run them with min_dev=0.10:


Average score           Test
Corpus  Parms   0           1           2
ham     none    0.00233790  0.00371610  0.00310700
        radw    0.00374480  0.00461440  0.00509060
        readw   0.00374480  0.00461440  0.00509060
        sradw   0.00374480  0.00461430  0.00509110
        sreadw  0.00374480  0.00461430  0.00509110
        sw      0.00374800  0.00461890  0.00509380
spam    none    0.96398650  0.97528450  0.97076220
        radw    0.95979440  0.97210260  0.97014640
        readw   0.95979440  0.97210260  0.97014640
        sradw   0.95979440  0.97210230  0.97014720
        sreadw  0.95979440  0.97210230  0.97014720
        sw      0.95979510  0.97210230  0.97014860
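
For a quick read of the re-run numbers, the shift from parms "none" to
"radw" can be computed per test run. This is only a rough sketch: the
values are copied from the four none/radw rows of the min_dev=0.10 table,
and the names (scores, shift) are mine, not part of bogofilter or
spamitarium.

```python
# Average scores copied from the min_dev=0.10 table (tests 0, 1, 2).
scores = {
    ("ham", "none"): [0.00233790, 0.00371610, 0.00310700],
    ("ham", "radw"): [0.00374480, 0.00461440, 0.00509060],
    ("spam", "none"): [0.96398650, 0.97528450, 0.97076220],
    ("spam", "radw"): [0.95979440, 0.97210260, 0.97014640],
}


def shift(corpus, parms, baseline="none"):
    """Per-test difference: average score with `parms` minus the baseline."""
    with_parms = scores[(corpus, parms)]
    without = scores[(corpus, baseline)]
    return [round(a - b, 8) for a, b in zip(with_parms, without)]


for corpus in ("ham", "spam"):
    print(corpus, shift(corpus, "radw"))
```

On these numbers the ham averages move up by roughly 0.001 to 0.002 and
the spam averages move down by roughly 0.0006 to 0.004.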




robx        = 0.500000  # (5.00e-01)
robs        = 0.100000  # (1.00e-01)
min_dev     = 0.100000  # (1.00e-01)
ham_cutoff  = 0.100000  # (1.00e-01)
spam_cutoff = 0.900000  # (9.00e-01)
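
As I understand it, min_dev makes bogofilter ignore tokens whose
spamicity lies within min_dev of 0.5, so lowering it from 0.375 to 0.10
lets more tokens take part in the score. The sketch below only
illustrates that idea. It is not bogofilter's code: the token scores are
made up, and treating the boundary as inclusive is my assumption.

```python
EVEN_ODDS = 0.5  # a token with this spamicity says nothing either way


def tokens_used(token_scores, min_dev):
    """Keep tokens whose spamicity is at least min_dev away from 0.5.

    Assumption: the comparison is inclusive; the real cutoff test may differ.
    """
    return {
        token: p
        for token, p in token_scores.items()
        if abs(p - EVEN_ODDS) >= min_dev
    }


# Hypothetical per-token spamicities, for illustration only.
example = {"viagra": 0.99, "meeting": 0.05, "free": 0.80, "offer": 0.62, "the": 0.45}

print(sorted(tokens_used(example, 0.375)))
print(sorted(tokens_used(example, 0.10)))
```

With the made-up scores above, the stricter setting keeps only the two
most extreme tokens, while min_dev=0.10 also admits the moderately
scored ones.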




