tuning.sh [was: bogofilter-0.13.6.3 - new current release]
Boris 'pi' Piwinger
3.14 at logic.univie.ac.at
Fri Jun 20 17:48:56 CEST 2003
Greg Louis <glouis at dynamicro.on.ca> wrote:
>> > The rule of thumb is that $target should be 0.1% to 0.3% of the test set
>> > size.
>>
>> For me that would be more than 70 (using .2%). Or do you
>> only count half the size since the other half is used to
>> build the database? Anyhow this would still be way too big.
>>
>YET AGAIN ONCE MORE ANOTHER (and last) TIME:
>
>You need more false positives to run a parameter scan than you want to
>have in production.
I recall you said it should lead to something where the test
finds a cutoff not too close to .5 or 1. That failed for 12
or 24. For 3:
  robs  min_dev  spam_cutoff  run0  run1  run2  total
0.1000    0.450     0.501000    60    62    64    186
0.0320    0.425     0.524000    86    97    81    264
0.0320    0.450     0.682000   100    93    88    281
0.0100    0.425     0.515000    96   101    87    284
0.1000    0.425     0.573000    91   102    91    284
0.3200    0.450     0.625000    98   112   100    310
0.0100    0.450     0.686000   111   108   103    322
0.0100    0.400     0.515000   112   123   111    346
0.0320    0.400     0.535000   115   125   113    353
0.0100    0.375     0.521000   119   126   125    370
>David is quite right: if you use tuning.sh you
>should set the target somewhere between 0.1% and 0.3% of the total size
>of your test files (r?.ns).
I see, of a single such file! That gives 7.5 for 0.2%.
>Your own results show that 12 is too low,
Contradicting the above.
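
To make the arithmetic behind that objection explicit, here is a small
sketch of the 0.1% to 0.3% rule of thumb. It is not part of tuning.sh.
The function name target_range and the figure of 3750 messages per test
file are assumptions for illustration; 3750 is only inferred from
"7.5 for 0.2%" above (7.5 / 0.002), and the sketch takes the
single-file reading of the rule.

```python
def target_range(messages, low_permille=1, high_permille=3):
    """Return (low, high) target counts for the 0.1%-0.3% rule of thumb.

    Per-mille integers are used so the division stays exact for
    the small numbers involved here.
    """
    return (messages * low_permille / 1000,
            messages * high_permille / 1000)


# Assumption: size of a single test file, inferred from 7.5 / 0.002.
MESSAGES_PER_FILE = 3750

low, high = target_range(MESSAGES_PER_FILE)
midpoint = MESSAGES_PER_FILE * 2 / 1000  # the 0.2% figure quoted above

print(f"rule-of-thumb range: {low} to {high} (0.2% gives {midpoint})")

# The three target values tried in this thread.
for target in (3, 12, 24):
    verdict = "inside" if low <= target <= high else "outside"
    print(f"target {target}: {verdict} the range")
```

Under the single-file reading, 12 sits slightly above the upper bound
of the range rather than below it. If the rule is meant to apply to the
combined size of all the test files, the bounds scale up accordingly.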
pi