[bogofilter] Improved Calculations
Tom Allison
tallison at tacocat.net
Thu May 13 22:29:04 CEST 2004
David Relson wrote:
>>Right. But "run test" can be extremly expensive;-)
>
>
> Hi pi,
>
> Indeed. Bogotune's additional scanning (5 sp_esf values and 5 ns_esf
> values) has increased its workload by 25. However if you want
> demonstrably useful numbers, you've got to do the work. Think of it as
> a task to keep your computer out of trouble while you're sleeping.
>
> David
Like "bogo@home"? :)
I tried something that is statistically illegal but gave the same
results in much less time when I was playing with
robx/robs/min_dev/subnet/ascii_char values. For what it's worth, I'll
share it, with the caveat that this was probably a bad thing to do, I
got lucky, and my statistics professor will hunt me down for posting this.
Keeping all the other variables fixed, I varied only one and found its
best value.
Using that best value, I then varied another variable and found the best
for that one.
And so on.
I started with robx=(1.0, 0.1, 0.01) and chose 1.0,
then did block_on_subnets and replace_nonascii_chars and picked yes for
both.
Finally I tried robx=(0.4, 0.5, 0.6) and picked 0.6.
This gave the same result as running all 24 parameter combinations and
selecting the best set of test results across all three sample sets.
I ended up at the same place in much less time.
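
The procedure above can be sketched roughly as follows. This is only a toy:
toy_errors() is a made-up stand-in for "run the test corpus and count the
errors", the candidate values are hypothetical, and the score is deliberately
built so that each parameter contributes on its own. The option names are
borrowed from the ones mentioned above purely as labels.

```python
from itertools import product

# Hypothetical candidate values; not taken from any real tuning run.
GRID = {
    "robx": [0.4, 0.5, 0.6],
    "min_dev": [0.05, 0.1, 0.2],
    "block_on_subnets": [False, True],
}

def toy_errors(p):
    """Made-up error count (lower is better). Each parameter adds its own
    penalty, so the parameters are independent by construction."""
    return (abs(p["robx"] - 0.6) * 100
            + abs(p["min_dev"] - 0.1) * 50
            + (0 if p["block_on_subnets"] else 3))

def one_at_a_time(grid, score, start):
    """Vary one parameter, keep its best value, then move to the next."""
    best = dict(start)
    runs = 0
    for name, values in grid.items():
        trials = []
        for value in values:
            trials.append((score({**best, name: value}), value))
            runs += 1
        best[name] = min(trials, key=lambda t: t[0])[1]
    return best, runs

def full_grid(grid, score):
    """Score every combination and keep the best one."""
    names = list(grid)
    best, best_score, runs = None, None, 0
    for combo in product(*grid.values()):
        params = dict(zip(names, combo))
        s = score(params)
        runs += 1
        if best_score is None or s < best_score:
            best, best_score = params, s
    return best, runs

start = {name: values[0] for name, values in GRID.items()}
greedy_best, greedy_runs = one_at_a_time(GRID, toy_errors, start)
grid_best, grid_runs = full_grid(GRID, toy_errors)
print(greedy_best, greedy_runs)
print(grid_best, grid_runs)
```

With this toy score both searches land on the same settings, using 8 runs
instead of 18. That agreement is built into toy_errors() and says nothing
about how real bogofilter parameters behave.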
Statistically this could be highly illegal, unless these variables are
independent of each other (the best value of one does not depend on how
the others are set)...
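
To illustrate why independence matters, here is a small counterexample with
invented numbers: two on/off options that only help when both are switched
on together. The ERRORS table is entirely made up and does not describe any
real bogofilter option.

```python
# Made-up error counts for two on/off options that interact:
# each one helps only when the other is also on.
ERRORS = {
    (False, False): 5,
    (True, False): 6,
    (False, True): 6,
    (True, True): 0,
}

def vary_one_at_a_time(errors, start=(False, False)):
    """Flip one option at a time, keeping whichever setting scores best."""
    current = list(start)
    for i in range(len(current)):
        candidates = []
        for value in (False, True):
            trial = list(current)
            trial[i] = value
            candidates.append(tuple(trial))
        current = list(min(candidates, key=errors.get))
    return tuple(current)

greedy = vary_one_at_a_time(ERRORS)
exhaustive = min(ERRORS, key=ERRORS.get)
print(greedy, ERRORS[greedy])
print(exhaustive, ERRORS[exhaustive])
```

Here the one-at-a-time search never leaves its starting point (5 errors),
while checking all four combinations finds the setting with 0 errors.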
If I somehow wasn't just lucky, then the cost of combining tests could be
additive and not multiplicative. (2 sets of 5 tests would increase the
work by +10 and not *25.)
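
The arithmetic behind that, as a quick sketch: a full grid needs the product
of the number of values per parameter, while varying one parameter at a time
needs their sum. The function name runs_needed is just for illustration.

```python
from math import prod

def runs_needed(values_per_parameter):
    """Return (full-grid runs, one-at-a-time runs) for the given counts."""
    grid = prod(values_per_parameter)
    one_at_a_time = sum(values_per_parameter)
    return grid, one_at_a_time

# Two parameters with 5 candidate values each, as in the example above.
grid_runs, sequential_runs = runs_needed([5, 5])
print(grid_runs, sequential_runs)
```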