Database maintenance with combined wordlist.

Greg McCann greg at cambria.com
Fri Sep 26 23:46:25 CEST 2003


Greetings,

I recently converted from using separate spamlist.db/goodlist.db wordlists to using the combined wordlist, wordlist.db.  (I am now running bogofilter 0.15.4.)

A useful feature of the separate wordlists was that I could run a daily cron job to prune old entries, with a different expiration time for each list.  Here is my cron job, bogomaint.cron:

# remove records older than 30 days from spamlist.db
/usr/local/bin/bogoutil -a30 -m /home/bogofilter/spamlist.db
# remove records older than 60 days from goodlist.db
/usr/local/bin/bogoutil -a60 -m /home/bogofilter/goodlist.db

I do this to keep my wordlists fresh and to keep them from growing too large.  I used different expiration times because I have far more spam than ham in my wordlists, and I want to keep the two lists roughly the same size.  Even with ham allowed to age twice as long as spam, the spam list was still twice as large as the ham list.

However, with the combined wordlist there does not seem to be any way to specify different expiration times for spam and ham.  Is there a feature I am not aware of that would allow me to do this?  If not, would it be possible/practical/reasonable to provide one?  In addition to the existing selection criteria (-a, -c, -s), it would be nice if I could specify whether a maintenance action applies only to spam or only to ham (with "both" being the default).
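To make the request concrete, here is a rough sketch in Python of the kind of selection I have in mind.  It is not an existing bogoutil feature.  It assumes the combined wordlist has been dumped to text with one record per line in the form "token spamcount goodcount YYYYMMDD", that bookkeeping tokens start with a dot, and that a token counts as "spam" when its spam count exceeds its good count.  All three are assumptions for illustration, and the function names are hypothetical.

```python
from datetime import date, datetime, timedelta


def keep_record(line, today, spam_days=30, ham_days=60):
    """Decide whether one dumped wordlist record should survive pruning.

    Assumed record layout (not verified against bogoutil):
        token spamcount goodcount YYYYMMDD
    """
    fields = line.rsplit(None, 3)
    if len(fields) != 4:
        return True  # unrecognized layout: leave the record alone
    token, spam, good, stamp = fields
    if token.startswith("."):
        return True  # assumed bookkeeping token, never pruned here
    try:
        spam_count, good_count = int(spam), int(good)
        last_seen = datetime.strptime(stamp, "%Y%m%d").date()
    except ValueError:
        return True  # unparseable counts or date: leave the record alone
    # Hypothetical rule: a token is treated as spam when it was seen
    # more often in spam than in ham, and as ham otherwise.
    limit = spam_days if spam_count > good_count else ham_days
    return today - last_seen <= timedelta(days=limit)


def prune(lines, today, spam_days=30, ham_days=60):
    """Return only the records that are still within their age limit."""
    return [ln for ln in lines if keep_record(ln, today, spam_days, ham_days)]
```

Something like this could presumably sit between a text dump (bogoutil -d) and a reload into a fresh database (bogoutil -l), but a built-in option would be much cleaner than a round trip through a text file.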


Greg McCann






