db maintenance "delete oldest least used tokens, but maintain count of x"
matthias.andree at gmx.de
Thu Mar 4 19:51:58 EST 2004
On Thu, 04 Mar 2004, Chris Fortune wrote:
> Anybody have any ideas how to implement this db maintenance rule?:
> "delete oldest least used tokens, but maintain count of x data rows", x being 100,000.
> It would be good to keep balance of spammy : hammy tokens?
I wonder how all these radical DB maintenance functions are going to
adjust .MSG_COUNT to some reasonable value. People are
bothered about the ham:spam token ratio, but the more important
token count:mail count ratio isn't questioned.
Did I miss research that concluded "you can twist .MSG_COUNT or the
database all you want without adverse effects"? I believe not, and
that's the reason why I have never used maintenance functions like
I wonder if we should record the token frequency or timestamp tokens
after read, just for fun, so someone can put a size limit and implement
some LRU strategy. Needless to say it would be an awfully slow
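
For illustration only, here is a minimal Python sketch of the kind of size-limited LRU pruning discussed above. It is not bogofilter code: the TokenRecord type, the touch() and prune() helpers, and the use of a plain day number as the "timestamp after read" are all assumptions made for the example. It evicts the oldest tokens first and, among equally old ones, the least used, until at most max_rows remain. It deliberately leaves any message count alone, since how to adjust that is exactly the open question.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class TokenRecord:
    """Hypothetical per-token row; not bogofilter's actual record layout."""
    spam: int       # times the token was seen in spam
    ham: int        # times the token was seen in ham
    last_used: int  # assumed "timestamp after read", e.g. a day number


def touch(tokens: dict[str, TokenRecord], name: str, today: int) -> TokenRecord | None:
    """Look up a token and stamp it as used today (a write on every read)."""
    rec = tokens.get(name)
    if rec is not None:
        rec.last_used = today
    return rec


def prune(tokens: dict[str, TokenRecord], max_rows: int) -> list[str]:
    """Evict oldest, least used tokens until at most max_rows remain.

    Returns the evicted token names. Message counts are not adjusted here.
    """
    excess = len(tokens) - max_rows
    if excess <= 0:
        return []
    # Oldest first; among equally old tokens, the least used goes first.
    order = sorted(
        tokens,
        key=lambda t: (tokens[t].last_used, tokens[t].spam + tokens[t].ham),
    )
    evicted = order[:excess]
    for name in evicted:
        del tokens[name]
    return evicted
```

Note that touch() turns every lookup into a database write, which is presumably where the slowness would come from. Keeping the spammy:hammy balance would need an extra rule on top of this, and nothing here says what the pruned database should report as its mail count.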
Encrypt your mail: my GnuPG key ID is 0x052E7D95