tuning.sh [was: bogofilter-0.13.6.3 - new current release]

Greg Louis glouis at dynamicro.on.ca
Fri Jun 20 17:04:12 CEST 2003


On 20030620 (Fri) at 1408:14 +0200, Boris 'pi' Piwinger wrote:
> David Relson wrote:
> 
> > The rule of thumb is that $target should be 0.1% to 0.3% of the test set 
> > size.
> 
> For me that would be more than 70 (using .2%). Or do you
> only count half the size since the other half is used to
> build the database? Anyhow this would still be way too big.
> 
YET AGAIN ONCE MORE ANOTHER (and last) TIME:

You need more false positives to run a parameter scan than you want to
have in production.  David is quite right: if you use tuning.sh you
should set the target somewhere between 0.1% and 0.3% of the total size
of your test files (r?.ns).  Your own results show that 12 is too low,
as you're getting some reports of more fp during the scan -- and any
record for which the fp count is not the target is invalid.
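
To make the arithmetic concrete, here is a minimal sketch of the rule of thumb. The function names and the 36000-message test set are hypothetical, chosen only for illustration; nothing here is taken from tuning.sh itself.

```python
def target_range(total_messages):
    """Return (low, high) fp targets: 0.1% and 0.3% of the test set size.

    Integer arithmetic avoids float rounding surprises.
    """
    low = -(-total_messages // 1000)     # 0.1%, rounded up
    high = (3 * total_messages) // 1000  # 0.3%, rounded down
    return low, max(low, high)           # never let high fall below low


def record_is_valid(fp_count, target):
    """A scan record only counts if its fp count equals the target."""
    return fp_count == target


# Hypothetical test set of 36000 messages (an assumed figure).
low, high = target_range(36000)
print(low, high)                 # 36 108
print(low <= 12 <= high)         # False: a target of 12 is below the range
print(record_is_valid(14, 12))   # False: the scan reported more fp than the target
```

With a test set of that size, a target of 12 falls well below the suggested range, and any scan record reporting some other fp count gets thrown out.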

AFTER you have your s and mindev (and x and cache size and ham cutoff)
properly set, THEN you ADJUST THE SPAM CUTOFF till you have a balance
between fn and fp that you can live with.
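
That last step can be sketched as follows. This is only an illustration: the scores are made up, the helper name is hypothetical, and I am assuming a message is called spam when its score is at or above the spam cutoff.

```python
def count_errors(ham_scores, spam_scores, spam_cutoff):
    """Return (fp, fn) for one spam cutoff.

    Assumption: a message is classified as spam when score >= spam_cutoff.
    """
    fp = sum(1 for s in ham_scores if s >= spam_cutoff)   # ham flagged as spam
    fn = sum(1 for s in spam_scores if s < spam_cutoff)   # spam that got through
    return fp, fn


# Made-up scores, for illustration only.
ham = [0.01, 0.20, 0.60, 0.97]
spam = [0.40, 0.96, 0.99, 1.00]

# Raise the cutoff step by step and watch fp fall while fn rises.
for cutoff in (0.50, 0.95, 0.98):
    fp, fn = count_errors(ham, spam, cutoff)
    print(f"cutoff={cutoff:.2f} fp={fp} fn={fn}")
```

Raising the cutoff trades false positives for false negatives; you pick the point where the balance is one you can live with, with every other parameter held fixed.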

-- 
| G r e g  L o u i s          | gpg public key: finger     |
|   http://www.bgl.nu/~glouis |   glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |



