min_dev

Tom Allison tallison at tacocat.net
Wed Jun 30 03:35:41 CEST 2004


David Relson wrote:
> On Tue, 29 Jun 2004 16:10:18 -0700
> .rp wrote:
> 
> 
>>Were there any responses to the question about the min_dev = .5
>>setting?
> 
> 
> Hi rp,
> 
> Nope.  No responses.  
> 
> Thinking about what you said, my thought is that together "min_dev" and
> "0.5" create an exclusion interval around 0.5.  
> 
> Currently, bogofilter's default is:
> 
>    min_dev=0.375
> 
> A more generic behavior would be to replace that with:
> 
>    excl_min = 0.125 # minimal value excluded
>    excl_max = 0.875 # maximum value excluded
> 
> or:
> 
>    excl_ctr=0.500   # center of exclusion interval
>    excl_siz=0.375   # size of exclusion interval
> 
> Folks, is this a good idea or a bad one?  Anybody want to test to see how
> this affects accuracy?
> 
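
To make the two proposed forms concrete, here is a minimal sketch of how
an exclusion interval might be applied to token probabilities. It is not
bogofilter's code: the function names are made up for illustration, and
it assumes that tokens strictly inside the interval are ignored and that
excl_siz is a half-width (distance from the center), which is what the
0.125/0.875 example implies.

```python
def interval_from_center(excl_ctr=0.5, half_width=0.375):
    """Return (excl_min, excl_max) for an interval given center and half-width.

    Hypothetical helper; with excl_ctr=0.5 this is the same as min_dev.
    """
    return excl_ctr - half_width, excl_ctr + half_width


def is_excluded(prob, excl_min, excl_max):
    """True if a token probability lies strictly inside the exclusion interval."""
    return excl_min < prob < excl_max


def kept_tokens(probs, excl_min, excl_max):
    """Keep only the token probabilities that fall outside the interval."""
    return [p for p in probs if not is_excluded(p, excl_min, excl_max)]


if __name__ == "__main__":
    probs = [0.05, 0.2, 0.5, 0.8, 0.95]

    # Symmetric case, equivalent to min_dev=0.375.
    lo, hi = interval_from_center(0.5, 0.375)
    print(lo, hi, kept_tokens(probs, lo, hi))

    # Asymmetric case, only expressible with excl_min/excl_max.
    print(kept_tokens(probs, 0.3, 0.875))
```

The asymmetric call is the extra freedom the proposal would add: the
interval no longer has to be centered on 0.5.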

So many degrees of freedom.
Imagine how long bogotune would take if it had to go through all these
variations!

Anecdotally, I find that most Unsure mail tends to have a small number
of very high scoring spam tokens (>0.9) and a large number of ham tokens
(<0.4).  The pattern is similar whether the message turns out to be spam
or ham, and from the 10-20 messages that I've considered carefully, I
can't really decide what works best.

Currently:
min_dev     = 0.465
ham_cutoff  = 0.15
spam_cutoff = 0.51

And I haven't even started to play with these ESF variables.
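
For reference, a rough sketch of how I read these three settings. It is
only an illustration, not bogofilter's implementation: the function
names are invented, and the boundary handling (>= and <=) is an
assumption.

```python
MIN_DEV = 0.465
HAM_CUTOFF = 0.15
SPAM_CUTOFF = 0.51


def counts_toward_score(prob, min_dev=MIN_DEV):
    """True if a token is far enough from 0.5 to be used (assumed rule)."""
    return abs(prob - 0.5) >= min_dev


def classify(score, ham_cutoff=HAM_CUTOFF, spam_cutoff=SPAM_CUTOFF):
    """Map a final message score to Ham / Unsure / Spam (assumed boundaries)."""
    if score >= spam_cutoff:
        return "Spam"
    if score <= ham_cutoff:
        return "Ham"
    return "Unsure"


if __name__ == "__main__":
    # With min_dev=0.465 a token at 0.9 sits inside the excluded band.
    for p in (0.02, 0.4, 0.9, 0.97):
        print(p, counts_toward_score(p))
    for s in (0.10, 0.30, 0.60):
        print(s, classify(s))
```

Under these assumptions, a min_dev of 0.465 excludes everything between
roughly 0.035 and 0.965, so a token scoring 0.9 would not count at all.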


