min_dev
Tom Allison
tallison at tacocat.net
Wed Jun 30 03:35:41 CEST 2004
David Relson wrote:
> On Tue, 29 Jun 2004 16:10:18 -0700
> .rp wrote:
>
>
>>Were there any responses to the question about the min_dev = .5
>>setting?
>
>
> Hi rp,
>
> Nope. No responses.
>
> Thinking about what you said, my thought is that together "min_dev" and
> "0.5" create an exclusion interval around 0.5.
>
> Currently, bogofilter's default is:
>
> min_dev=0.375
>
> A more generic behavior would be to replace that with:
>
> excl_min = 0.125 # minimal value excluded
> excl_max = 0.875 # maximum value excluded
>
> or:
>
> excl_ctr=0.500 # center of exclusion interval
> excl_siz=0.375 # size of exclusion interval
>
> Folks, is this a good idea or a bad one? Anybody want to test to see how
> this affects accuracy?
>
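To make the two proposed parameterizations concrete, here is a rough
Python sketch (not bogofilter code). It assumes excl_siz is the
half-width of the interval, which is what makes it agree with the
excl_min/excl_max values quoted above, and it assumes tokens strictly
inside the interval are the ones ignored. The function names and the
sample token scores are made up for illustration.

```python
def interval_from_center(excl_ctr, excl_siz):
    """Convert (center, half-width) into (excl_min, excl_max)."""
    return excl_ctr - excl_siz, excl_ctr + excl_siz


def is_excluded(p, excl_min, excl_max):
    """True if token score p falls inside the exclusion interval.

    Assumption: the boundaries themselves are kept, not excluded.
    """
    return excl_min < p < excl_max


# Current default: min_dev=0.375 around an implicit center of 0.5.
excl_min, excl_max = interval_from_center(0.500, 0.375)

# Hypothetical token scores, only to show which ones would still count.
tokens = {"viagra": 0.99, "meeting": 0.10, "the": 0.50, "free": 0.80}
kept = {t: p for t, p in tokens.items()
        if not is_excluded(p, excl_min, excl_max)}

print(excl_min, excl_max)
print(sorted(kept))
```

Either form describes the same interval; the center/size form just lets
the interval sit somewhere other than 0.5.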
So many degrees of freedom.
Imagine how long bogotune would take if it had to go through these
variations!
Anecdotally, I find that most Unsure mail tends to have a small number of
very high-scoring spam tokens (>0.9) and a large number of high-scoring
ham tokens (<0.4). The pattern is similar for both spam and ham, and
from the 10-20 messages that I've considered carefully, I can't really
decide what works best.
Currently:
min_dev = 0.465
ham_cutoff = 0.15
spam_cutoff = 0.51
And I haven't even started to play with these ESF variables.
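For anyone comparing settings, this is roughly how I understand the two
cutoffs to split the final message score into three classes. It is a
Python sketch, not bogofilter's code; the classify name is made up, and
treating scores exactly equal to a cutoff as Ham or Spam is my
assumption.

```python
def classify(score, ham_cutoff=0.15, spam_cutoff=0.51):
    """Map a message score in [0, 1] to Ham, Unsure, or Spam.

    Assumption: score >= spam_cutoff is Spam, score <= ham_cutoff is
    Ham, and everything in between is Unsure.
    """
    if score >= spam_cutoff:
        return "Spam"
    if score <= ham_cutoff:
        return "Ham"
    return "Unsure"


for s in (0.05, 0.30, 0.90):
    print(s, classify(s))
```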