Excessive memory usage: bug?

David Relson relson at osagesoftware.com
Fri Mar 11 00:55:51 CET 2005


On Thu, 10 Mar 2005 13:07:37 GMT
JUANVAQUEROPONC wrote:

> Excessive memory usage: bug?
> 
> I did:
> bogofilter -s < spam.mbox
> (with a big spam.mbox file of 300MB)
> 
> The memory usage is so huge that Linux kills the program or I need to
> kill it to continue using the machine.
> 
> Why does bogofilter use so much memory?
> Is it a memory leak?
> Is it a bug?
> 
> I'm using Debian unstable bogofilter_0.94.0-1 on x86

H'lo Juan,

When registering a mailbox (like you're doing), bogofilter does the
following:

  1. create a master wordlist
  2. read one message
  3. convert it to a list of tokens
  4. merge the new tokens with the master list
  5. repeat steps 2-4 for all messages
  6. update the database with the tokens of the master wordlist

This technique uses a fair amount of RAM, but it minimizes the disk
access needed to read and write the database.
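
To make the steps above concrete, here is a rough Python sketch of the
same batching idea. It is not bogofilter's actual code (bogofilter is
written in C); the token pattern, the "From " splitting, the choice to
count each token once per message, and the dict standing in for the
database are all assumptions made for illustration.

```python
import re
from collections import Counter

# Hypothetical token pattern; bogofilter's real lexer is more elaborate.
TOKEN_RE = re.compile(r"[a-z0-9$'!-]{3,30}")

def split_mbox(text):
    """Step 2: yield one message at a time, splitting on 'From ' lines."""
    current = []
    for line in text.splitlines():
        if line.startswith("From ") and current:
            yield "\n".join(current)
            current = []
        current.append(line)
    if current:
        yield "\n".join(current)

def tokenize(message):
    """Step 3: convert one message to a list of tokens."""
    return TOKEN_RE.findall(message.lower())

def register(mbox_text, database):
    """Register every message in mbox_text; return the master list size."""
    master = Counter()                       # step 1: master wordlist in RAM
    for message in split_mbox(mbox_text):    # steps 2 and 5: each message
        tokens = set(tokenize(message))      # step 3 (assumed: once per message)
        master.update(tokens)                # step 4: merge into master list
    for token, count in master.items():      # step 6: one update per unique token
        database[token] = database.get(token, 0) + count
    return len(master)
```

The master list stays in memory until the final loop, so the database
is touched once per unique token rather than once per message. That is
where both the RAM cost and the disk savings come from.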

The memory needed to process your 300MB mailbox depends on how many
unique tokens it contains.  Bogofilter tries to use memory efficiently
but, as explained above, registration trades higher memory use for
less disk I/O.  AFAIK, there aren't any memory leaks that would make
processing a 300MB file impossible (assuming your machine has enough
RAM).
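
As a toy illustration of why the unique-token count matters more than
the raw mailbox size, consider the sketch below. It uses plain
whitespace splitting and made-up messages (both are assumptions, not
what bogofilter does) and compares how large an in-memory master list
grows for two mailboxes with the same number of messages.

```python
def master_list_size(messages):
    """Return the number of unique tokens a master list would hold."""
    master = set()
    for message in messages:
        # Hypothetical tokenizer: split on whitespace only.
        master.update(message.split())
    return len(master)

# 1000 identical messages: the master list stays tiny.
repetitive = ["buy cheap pills now"] * 1000

# 1000 messages that each add one new token: the list grows per message.
varied = [f"buy cheap pills now id{i}" for i in range(1000)]

print(master_list_size(repetitive))
print(master_list_size(varied))
```

Mail that carries many one-off strings (message IDs, random hashes,
encoded attachments) behaves like the second case, so a mailbox of that
kind needs far more RAM than its size alone would suggest.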

HTH,

David


_______________________________________________
Bogofilter mailing list
Bogofilter at bogofilter.org
http://www.bogofilter.org/mailman/listinfo/bogofilter


