new writeup re varying Robinson's s and the minimum deviation

Greg Louis glouis at dynamicro.on.ca
Mon Mar 31 13:42:14 CEST 2003


On 20030331 (Mon) at 1052:07 +0200, Boris 'pi' Piwinger wrote:
> Greg Louis wrote:
> 
> > The big experiment I had been wanting to do, long delayed by hardware
> > problems, has now been completed.  The report is at
> > http://www.bgl.nu/bogofilter/smindev.html
> 
> Could you post to the list your optimal setting via
> bogofilter -qv? Thanks
> 
> pi

I'm not using the settings from these experiments in production; I'm
using

robx        = 0.415000 (4.15e-01)
robs        = 0.000000 (3.20e-07)
min_dev     = 0.050000 (5.00e-02)
ham_cutoff  = 0.100000 (1.00e-01)
spam_cutoff = 0.980000 (9.80e-01)

block_on_subnets = no
tag_header_lines = yes
replace_nonascii_characters = no

I _do_not_ recommend the above settings to anybody.  They look bogus to
me, though they're working well in my environment.  Once the experiment
runs that are in progress today have finished, I may well change them.
Then, if things look good in a week or two, I may be willing to make
some kind of recommendation.  Right now, my recommendation is taken
from conclusion number 1 at the above URL:

"Random choice of parameters like mindev and s, or choice based
on limited experience, or blind use of the defaults that come with the
bogofilter distribution, is not likely to give optimum discrimination
between spams and nonspams.  Tuning is required, and is likely to
be required again from time to time as bogofilter training
improves.  Fortunately, however, discrimination is quite good over
a wide range, so that the exact values chosen aren't crucial to the
success of the filter."
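
For anyone unsure what the parameters listed earlier actually control,
here is a minimal sketch in Python.  It is not bogofilter's source
code.  It assumes Robinson's published smoothing formula
f(w) = (s*x + n*p) / (s + n), with s = robs and x = robx, and it assumes
that a token is ignored when its score lies within min_dev of 0.5.  The
function names, and the exact handling of values that fall right on a
cutoff, are my own illustrative choices.

```python
# Illustrative sketch only; NOT bogofilter's actual implementation.
# Function names and boundary handling are assumptions made for clarity.

def robinson_f(p, n, robs, robx):
    """Robinson's smoothed token score f(w) = (s*x + n*p) / (s + n).

    p    -- raw spam probability estimate for the token, 0..1
    n    -- number of times the token was seen in training
    robs -- strength given to the prior (must be > 0 if n can be 0)
    robx -- score assumed for a token never seen before
    """
    return (robs * robx + n * p) / (robs + n)


def is_counted(fw, min_dev):
    """A token contributes only if its score is at least min_dev from 0.5."""
    return abs(fw - 0.5) >= min_dev


def classify(score, ham_cutoff, spam_cutoff):
    """Three-way decision on the final message score (boundaries assumed)."""
    if score >= spam_cutoff:
        return "spam"
    if score <= ham_cutoff:
        return "ham"
    return "unsure"


if __name__ == "__main__":
    # Values taken from the settings quoted earlier in this message.
    robx, robs, min_dev = 0.415, 3.2e-07, 0.05
    for p, n in [(0.9, 0), (1.0, 1), (0.52, 40)]:
        fw = robinson_f(p, n, robs, robx)
        print(p, n, fw, is_counted(fw, min_dev))
    print(classify(0.99, 0.10, 0.98))
```

With robs as small as 3.2e-07 the prior carries almost no weight, so
under this formula a token seen even once is scored almost entirely by
its raw estimate p, and only a token never seen at all falls back to
robx.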

-- 
| G r e g  L o u i s          | gpg public key: finger     |
|   http://www.bgl.nu/~glouis |   glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |



