Maintenance Best Practices
Dave Stubbs
dstubbs at penguin.8inchfloppy.com
Thu Jun 5 18:05:57 CEST 2003
Hello,
I'm wondering what others are doing out there to maintain their database
files. I receive between 100 and 200 emails per day, and send all emails
through bogofilter with the -f -p -u -l -v -e options.
I notice that my goodlist.db file grows by megabytes every day. For
instance, I blew it all away this morning and retrained with my collection
of mails. I have 17,218 "hams" and 3,600 "spams".
When finished, goodlist.db was about 6 megs.
Two hours later, it is already up to 13 megs from just processing the emails
that have come in since then.
At this rate, my goodlist.db hits 51 megabytes in about 4 days and bogofilter
stops working.
So, every night I run bogoutil to drop all tokens longer than 16 characters,
and then do something like this:
bogoutil -d goodlist.db | bogoutil -l work.db
and then I wipe goodlist.db and move work.db to goodlist.db.
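
For anyone who wants to try the same routine, here is a minimal sketch of how the nightly steps might be scripted. It is not the exact script I run. It assumes that `bogoutil -d` prints one record per line with the token as the first whitespace-separated field, and the names filter_long_tokens and compact_wordlist are made up for illustration.

```shell
#!/bin/sh
# Hypothetical helper: pass through only the dump records whose token
# (first field) is at most MAX characters long. MAX defaults to 16.
# Assumption: each record from `bogoutil -d` is a single line that
# starts with the token, followed by its count (and possibly a date).
filter_long_tokens() {
    awk -v max="${1:-16}" 'NF > 0 && length($1) <= max'
}

# Hypothetical wrapper: dump the wordlist, drop the long tokens, load
# the result into a fresh work file, then move it over the original.
# The original is only replaced if the load step reports success.
compact_wordlist() {
    db=$1
    work="${db%.db}.work.db"
    rm -f "$work"
    bogoutil -d "$db" | filter_long_tokens 16 | bogoutil -l "$work" \
        && mv "$work" "$db"
}
```

With those functions sourced, the nightly job would be something like `compact_wordlist goodlist.db`, run while no mail is being delivered so that bogofilter is not writing to the file at the same time.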
Even with this, the goodlist.db file gradually grows. The nightly cleanup
buys me enough time that I only have to blow it all away and re-train once a
month.
Anyone else wish to share their maintenance plans?
Or is anyone else having problems when the database files hit 51 megabytes?
Thanks,
Dave...