Filters That Fight Back

Jason Rennie jrennie at ai.mit.edu
Wed Sep 3 14:43:41 CEST 2003


jef at acme.com said:
> Eventually procmail will start to time out and I get a big mess in my
> inbox.  So, instead I use xargs and -B so that the mass registration
> gets broken up into batches and incoming mail gets a chance to run.

It might be worth building the new database somewhere else (e.g.
/tmp/.bogofilter) and overwriting the old database once the new one is
finished.  If you have no control over when e-mail is incorporated,
though, this would be a bit tricky, since you'd need to obtain a lock
on the old database before overwriting it.
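
To make the idea concrete, here is a rough Python sketch of just the swap
step.  Everything in it is an assumption for illustration: the function
name, the paths, and especially the lock file, which is hypothetical and
only helps if whatever delivers mail takes the same lock before opening
the database.  It is not bogofilter's own locking scheme.  It also
assumes the new database was built on the same filesystem as the live
one; if /tmp is a separate filesystem, copy the finished file next to
the live database first.

```python
import fcntl
import os


def swap_in_database(new_path, live_path, lock_path):
    """Replace live_path with new_path while holding an exclusive lock.

    lock_path is a hypothetical lock file; the mail delivery side must
    take the same lock for this to protect anything.
    """
    with open(lock_path, "w") as lock:
        # Blocks until no other process holds the lock.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            # Atomic on POSIX when both paths are on the same filesystem;
            # raises OSError if they are on different filesystems.
            os.replace(new_path, live_path)
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
```

Because the rename is atomic, the lock is held only for an instant, so
incoming mail is never kept waiting for the whole retraining run.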

relson at osagesoftware.com said:
> The problem with registering lots of messages from an MH or Maildir
> was that bogofilter updated the wordlist for each input file.  That
> was slow.  When you test the 0.15.0 code I think you'll find that it's
> comparably fast for mboxes and Maildirs.

I can vouch for this statement.  Training with pre-0.15 versions took me
hours.  Now it takes a minute or two.  I have about 2000 spam and 8000 ham
in MH folders.

Here's a script that automates the process of bogofilter training for MH
folders (requires the soon-to-be-released 0.15.1 :)

  http://www.ai.mit.edu/~jrennie/mail/bogoTrain-0.2.perl

Jason





