Maintenance Best Practices

Dave Stubbs dstubbs at penguin.8inchfloppy.com
Thu Jun 5 18:05:57 CEST 2003


Hello,

I'm wondering what others are doing out there to maintain their database
files.  I receive between 100 and 200 emails per day, and pipe every message
through bogofilter with the -f -p -u -l -v -e options.

I notice that my goodlist.db file grows by megabytes every day.  For
instance, I blew it all away this morning and retrained with my collection
of mails.  I have 17,218 "hams" and 3,600 "spams".

When finished, goodlist.db was about 6 megs.

Two hours later, it is already up to 13 megs from just processing the emails
that have come in since then.

At this rate, my goodlist.db hits 51 megabytes in about 4 days and
bogofilter stops working.

So, every night I run bogoutil to drop all tokens larger than 16 characters,
and then do something like this:

bogoutil -d goodlist.db | bogoutil -l work.db

and then I wipe goodlist.db and move work.db to goodlist.db.
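
For anyone who wants to try the same thing, here is a rough sketch of that
nightly routine as a shell script.  Only bogoutil -d (dump) and -l (load)
are taken from the command above.  The function names, the work-file name,
and the awk filter are illustrative assumptions, and the filter assumes each
dump line starts with the token followed by whitespace.

```shell
#!/bin/sh
# Sketch of the nightly compaction described above. Function names are
# hypothetical; only bogoutil -d (dump) and -l (load) come from the post.

# Keep only dump lines whose token (first field) is at most $1 characters.
# Assumes each dump line starts with the token followed by whitespace.
filter_long_tokens() {
    awk -v max="$1" 'length($1) <= max'
}

# Dump the wordlist, drop long tokens, load the rest into a fresh file,
# and swap it into place only if the load succeeded.
compact_wordlist() {
    db=$1
    work=${db%.db}.work.db
    rm -f "$work"
    bogoutil -d "$db" | filter_long_tokens 16 | bogoutil -l "$work" &&
        mv "$work" "$db"
}

# Usage (not run automatically): compact_wordlist goodlist.db
```

Loading into a separate file and renaming it only on success means a failed
run leaves the existing goodlist.db untouched.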

Even with this, the goodlist.db file gradually grows, but the nightly
cleanup buys enough time that I only have to blow it all away and retrain
about once a month.

Anyone else wish to share their maintenance plans?

Or is anyone else having problems when the database files hit 51 megabytes?

Thanks,

Dave...




