testing questions

Mon Mar 8 05:28:56 CET 2010

> Assuming the OP's question was an academic exercise, I think that:
>
>    bogofilter -v -m 0.51,1.0,0.5
>
> will score every message at 0.500000 regardless of any training (because
> no tokens would be used since min_dev would be impossible to satisfy).
>
> If so, then:
>
>    bogofilter -v -m 0.51,1.0,0.5 -o .51,0
>
> should always classify as ham, and:
>
>    bogofilter -v -m 0.51,1.0,0.5 -o 0.5,0
>
> should always classify as spam. There's some leeway with the numbers,
> but I _think_ those should work. None of this would be useful for
> actually scoring emails, of course, but it's sort of interesting.
>
I found a machine that I could test with, so here is what I found.

The "ham classifier" settings worked as expected, but I needed to change
the "spam classifier" settings to:

robx        = 0.510000
robs        = 1.000000
min_dev     = 0.500000
ham_cutoff  = 0.000000
spam_cutoff = 0.510000
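
For anyone following along, here is a rough Python sketch of why these
settings force a fixed verdict. It is only an illustration, not bogofilter's
code. It assumes that a token counts only when its probability is at least
min_dev away from 0.5, that the score falls back to robx when no token
qualifies, and that a ham_cutoff of 0 means two-state classification with
spam at scores greater than or equal to spam_cutoff. The function names are
made up for the example.

```python
# Illustrative sketch only; these helpers are hypothetical, not bogofilter code.

def usable_tokens(probs, min_dev):
    """Keep token probabilities at least min_dev away from 0.5 (assumed rule)."""
    return [p for p in probs if abs(p - 0.5) >= min_dev]

def score(probs, min_dev, robx):
    """Return robx when no token qualifies (assumption); combining is omitted."""
    if not usable_tokens(probs, min_dev):
        return robx
    raise NotImplementedError("token combining is outside this sketch")

def classify(s, spam_cutoff, ham_cutoff):
    """Two-state when ham_cutoff is 0, otherwise tri-state (assumed rule)."""
    if s >= spam_cutoff:
        return "spam"
    if ham_cutoff == 0.0 or s <= ham_cutoff:
        return "ham"
    return "unsure"

# Example token probabilities; none is a full 0.5 away from even odds.
s = score([0.01, 0.30, 0.99], min_dev=0.5, robx=0.51)
print(classify(s, spam_cutoff=0.51, ham_cutoff=0.0))  # spam
```

Under those assumptions, every message gets the robx value as its score, so
the verdict depends only on how robx compares with spam_cutoff.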

It's likely that Matt had intended this all along, but forgot to  
twiddle the -o values after copying and pasting his "ham classifier"  
settings.

Thanks for the advice.

Mike B.
