db maintenance "delete oldest least used tokens, but maintain count of x"

David Relson relson at osagesoftware.com
Wed Mar 17 13:19:45 CET 2004


On Wed, 17 Mar 2004 12:57:26 +0100
Matthias Andree wrote:

> Tom Allison <tallison at tacocat.net> writes:
> 
> > So the MSG_COUNT isn't decreased by any maintenance functions (-a -c
> > -n)?
> >
> > I assumed it was rebuilt if you reloaded the database from text.
> > But I guess this isn't the case on further inspection.
> >
> > I assume then for proper care and feeding of bogofilter, there
> > really isn't much reason (rather, requirement) to use any of the
> > maintenance utilities unless you have performance needs.
> >
> > Am I close?
> 
> Having looked (but not posted), it appears as though the MSG_COUNT was
> used to evaluate the individual spamicity of the token only, and hence
> dropping tokens (without changing their associated spam/ham counts)
> should be safe. This all providing that I haven't missed a reference
> to the MSG_COUNT.

Matthias,

That's correct. AFAICT removing unwanted tokens from the wordlist is OK.
It has the obvious effects: the wordlist gets smaller and the removed
tokens become unknown.  It doesn't affect the remaining tokens.  Of
course, if someone deletes the wrong tokens, the effects will be serious.
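
To illustrate the point, here is a minimal Python sketch.  It is not
bogofilter's actual code: the dictionary wordlist, the spamicity()
helper, the sample counts and the robx/robs values are all assumptions
made up for the example, and the formula is a simplified Robinson-style
score.  It only shows that a token's score depends on its own counts
plus MSG_COUNT, so dropping another token leaves it unchanged.

```python
# Hypothetical, simplified model of a wordlist; not bogofilter's real code.
MSG_COUNT = ".MSG_COUNT"  # special record: (spam messages, ham messages)


def spamicity(wordlist, token, robx=0.5, robs=1.0):
    """Robinson-style per-token score; robx and robs are assumed values."""
    spam_msgs, ham_msgs = wordlist[MSG_COUNT]
    spam_n, ham_n = wordlist.get(token, (0, 0))
    n = spam_n + ham_n
    if n == 0:
        return robx  # unknown token: fall back to the neutral prior
    spam_rate = spam_n / spam_msgs
    ham_rate = ham_n / ham_msgs
    p = spam_rate / (spam_rate + ham_rate)
    return (robs * robx + n * p) / (robs + n)


# Made-up sample data: token -> (spam count, ham count)
wordlist = {
    MSG_COUNT: (100, 200),
    "viagra": (40, 1),
    "meeting": (2, 60),
    "rare": (1, 0),
}

before = {t: spamicity(wordlist, t) for t in ("viagra", "meeting")}
del wordlist["rare"]  # maintenance: drop one token, leave MSG_COUNT alone
after = {t: spamicity(wordlist, t) for t in ("viagra", "meeting")}

print(before == after)              # remaining tokens keep their scores
print(spamicity(wordlist, "rare"))  # the dropped token is now unknown
```

In this sketch the scores of "viagra" and "meeting" are identical
before and after the deletion, while "rare" falls back to the assumed
robx value.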

David



