db maintenance "delete oldest least used tokens, but maintain count of x"

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Wed Mar 17 13:23:02 CET 2004


David Relson wrote:

>> Having looked (but not posted), it appears as though the MSG_COUNT was
>> used to evaluate the individual spamicity of the token only, and hence
>> dropping tokens (without changing their associated spam/ham counts)
>> should be safe. This all providing that I haven't missed a reference
>> to the MSG_COUNT.
> 
> That's correct. AFAICT removing unwanted tokens from the wordlist is OK.
> It has the obvious effects - smaller wordlist and tokens becoming
> unknown.  It doesn't affect the remaining tokens.  Of course if someone
> deletes the wrong tokens, the effects will be serious.

The danger seems to be that once the tokens return, their
values will be horribly wrong.
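
To illustrate one reading of that concern, here is a minimal sketch in
Python. It assumes a Robinson-style per-token estimate
f(w) = (s*x + n*p) / (s + n); the function name, the values of s and x,
and all message and token counts are made up for the example and are not
taken from the bogofilter source. A token that was deleted and later
re-registered starts again from zero, while the message counts keep
their full history:

```python
def token_spamicity(spam_count, ham_count, spam_msgs, ham_msgs, s=1.0, x=0.5):
    """Robinson-style estimate f(w) = (s*x + n*p) / (s + n).

    s and x are illustrative values (assumptions), not bogofilter defaults.
    """
    n = spam_count + ham_count
    if n == 0:
        return x  # unknown token: fall back to the prior x
    spam_rate = spam_count / spam_msgs  # share of spam messages containing the token
    ham_rate = ham_count / ham_msgs     # share of ham messages containing the token
    p = spam_rate / (spam_rate + ham_rate)
    return (s * x + n * p) / (s + n)


# Hypothetical message totals; these stay untouched when a token is deleted.
SPAM_MSGS, HAM_MSGS = 1000, 1000

# Token seen in 50 spam messages and never in ham: strongly spammy.
before = token_spamicity(50, 0, SPAM_MSGS, HAM_MSGS)

# Token deleted from the wordlist: it is simply unknown.
deleted = token_spamicity(0, 0, SPAM_MSGS, HAM_MSGS)

# Token returns via a single ham message; its spam history is gone.
returned = token_spamicity(0, 1, SPAM_MSGS, HAM_MSGS)

# What the value would have been had the token never been deleted.
kept = token_spamicity(50, 1, SPAM_MSGS, HAM_MSGS)

print(f"before={before:.3f} deleted={deleted:.3f} "
      f"returned={returned:.3f} kept={kept:.3f}")
```

Under these assumptions the returned token scores as hammy, whereas with
its history intact it would still score as clearly spammy.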

pi



