db maintenance "delete oldest least used tokens, but maintain count of x"

David Relson relson at osagesoftware.com
Fri Mar 5 16:48:47 CET 2004


On Fri, 05 Mar 2004 16:28:13 +0100
Matthias Andree wrote:

> David Relson <relson at osagesoftware.com> writes:
> 
> > I'm not sure what to do about .MSG_COUNT either.  My educated guess
> > is "don't worry, do nothing."
> 
> Won't work in the long run. One half of your tokens have expired,
> .MSG_COUNT is way too large.

Is it?  Or does it reflect shorter messages?  When "xyz" expires, it's
comparable to saying that the message was one token shorter than
bogofilter originally thought it to be.

My guess is that tokens will expire more or less evenly from ham and
spam.  The database will, in some sense, resemble Swiss cheese (lots of
substance and lots of holes).  Since the holes represent unused/unwanted
tokens, the cheese is lighter in weight, but is no smaller in outer
dimensions.

Many tokens will continue to exist.  The counts for "relson" and
"osagesoftware.com" will not be affected by age or date pruning, hence
the number of messages isn't affected.
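
A minimal sketch of the argument above, assuming a toy in-memory
wordlist and a simplified spam/ham frequency ratio.  The data layout,
the function names (spamicity, prune) and all numbers are hypothetical
and are not bogofilter's actual storage format or scoring formula.  It
only illustrates that pruning a rare token while leaving the message
counts alone does not change the estimate for the tokens that survive.

```python
# Toy wordlist: token -> (spam_count, ham_count).
# Hypothetical data, not bogofilter's on-disk format.
wordlist = {
    "relson": (1, 40),
    "osagesoftware.com": (2, 35),
    "xyz": (1, 0),  # rare token, a candidate for expiry
}

# Plays the role of .MSG_COUNT: messages registered per class.
msg_count = {"spam": 100, "ham": 100}


def spamicity(token, wordlist, msg_count):
    """Simplified per-message frequency ratio (an assumption for illustration)."""
    spam, ham = wordlist[token]
    spam_freq = spam / msg_count["spam"]
    ham_freq = ham / msg_count["ham"]
    return spam_freq / (spam_freq + ham_freq)


def prune(wordlist, min_total):
    """Drop tokens whose combined count is below min_total; keep the rest unchanged."""
    return {tok: counts for tok, counts in wordlist.items() if sum(counts) >= min_total}


kept = ("relson", "osagesoftware.com")
before = {tok: spamicity(tok, wordlist, msg_count) for tok in kept}

pruned = prune(wordlist, min_total=2)  # removes "xyz", leaves msg_count untouched

after = {tok: spamicity(tok, pruned, msg_count) for tok in kept}
```

In this sketch "xyz" disappears from the wordlist, while the values for
"relson" and "osagesoftware.com" are the same before and after, since
neither their counts nor the message counts were modified.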
