Incorrigible spam

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Tue Apr 13 14:07:05 CEST 2004


David Relson wrote:

>> Successively registering these hams and spams until they each score
>> correctly will polarize the difference while neutralizing the
>> intersection.  This is precisely what we would want to achieve.
> 
> pi has mentioned effects like that.  After a train-on-error pass,
> additional passes will show "errors" that weren't in the original pass.
> Adding tokens to the wordlist does affect previous scores.  In most
> cases the effect is very, very small.

Yes, this is true. In general the system becomes more and
more stable over time: after the initial training (to
exhaustion), a correction run will turn up several errors,
the next one far fewer, and at some point there are often
no corrections needed at all. Of course, a few messages
sometimes come back; there is no monotonicity ;-)

I don't keep or collect statistics on errors. I only
estimate the rate from the number of messages that come in
between two corrections, and those numbers may be too small
to be significant, so take the following with some care:

I don't see that the stability described above improves the
error rate. Maybe slightly, but the system is already very
stable after, say, the first two or three corrections.

Anyhow, corrections can and do change the scores of other
messages, and this can move messages across the cutoffs.
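
To make that side effect concrete, here is a minimal sketch in
Python. It is not bogofilter's scoring: ToyFilter, its add-one
smoothing, the single 0.5 cutoff and the two sample messages are
all assumptions made up for illustration. It only shows that
registering one message can push another one over the cutoff, and
that repeated train-on-error passes settle down.

```python
from collections import Counter


class ToyFilter:
    """Toy token-count filter (hypothetical; NOT bogofilter's algorithm)."""

    def __init__(self, cutoff=0.5):
        self.spam = Counter()   # token -> number of registered spams containing it
        self.ham = Counter()    # token -> number of registered hams containing it
        self.cutoff = cutoff

    def register(self, tokens, is_spam):
        # Count each token once per registered message.
        (self.spam if is_spam else self.ham).update(set(tokens))

    def score(self, tokens):
        # Average per-token spam probability with add-one smoothing,
        # so an unseen token contributes exactly 0.5.
        probs = []
        for t in set(tokens):
            s, h = self.spam[t], self.ham[t]
            probs.append((s + 1) / (s + h + 2))
        return sum(probs) / len(probs)

    def is_spam(self, tokens):
        return self.score(tokens) > self.cutoff


def train_on_error(filt, corpus, max_passes=20):
    """Register only misclassified messages, pass after pass, until a
    full pass needs no correction. Returns the corrections per pass."""
    history = []
    for _ in range(max_passes):
        errors = 0
        for tokens, is_spam in corpus:
            if filt.is_spam(tokens) != is_spam:
                filt.register(tokens, is_spam)
                errors += 1
        history.append(errors)
        if errors == 0:
            break
    return history


if __name__ == "__main__":
    spam_msg = ["cheap", "pills", "meeting"]
    ham_msg = ["meeting", "agenda"]

    f = ToyFilter()
    before = f.is_spam(ham_msg)
    f.register(spam_msg, True)       # correct one message ...
    after = f.is_spam(ham_msg)       # ... and another one changes class
    print(before, after)

    print(train_on_error(ToyFilter(), [(spam_msg, True), (ham_msg, False)]))
```

The shared token "meeting" is what couples the two messages: the
ham was never touched, yet its score moves when the spam is
registered. With a real wordlist the shift per token is usually
much smaller, as David says.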

pi



