Word list size question

David Relson relson at osagesoftware.com
Sun Oct 31 20:22:30 CET 2010


Hello Doug,

The size of the wordlist database varies with the amount of training
you do.  120MB doesn't seem particularly large.  I use Berkeley DB
rather than SQLite, and my wordlist is 3x larger than that.

Dumping and reloading a wordlist will compact it -- at least for a
while.  With Berkeley DB, as tokens are added, the DB will (as
needed) split a database block into 2 blocks to provide room for the
new token.  Each split leaves both blocks partly empty, so the file
grows faster than the data it holds.
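
To illustrate why splitting inflates the file and why a dump/reload
shrinks it again, here is a toy model in Python.  It is only a sketch:
BLOCK_CAPACITY, insert_token, grow and dump_and_reload are hypothetical
names made up for the illustration, blocks hold a fixed count of
integer "tokens" rather than bytes, and nothing here is Berkeley DB's
actual code.

```python
import random

BLOCK_CAPACITY = 8  # hypothetical tokens per block; real blocks are sized in bytes


def insert_token(blocks, token):
    """Insert a token into the block covering it, splitting on overflow."""
    # Use the first block whose largest token is >= the new token,
    # falling back to the last block.
    index = len(blocks) - 1
    for i, block in enumerate(blocks):
        if block and token <= block[-1]:
            index = i
            break
    block = blocks[index]
    block.append(token)
    block.sort()
    if len(block) > BLOCK_CAPACITY:
        # Overflow: replace the block with two roughly half-full blocks.
        middle = len(block) // 2
        blocks[index:index + 1] = [block[:middle], block[middle:]]


def grow(tokens):
    """Build the block list by inserting tokens one at a time."""
    blocks = [[]]
    for token in tokens:
        insert_token(blocks, token)
    return blocks


def dump_and_reload(blocks):
    """Write all tokens out in order and pack them into full blocks."""
    tokens = sorted(t for block in blocks for t in block)
    return [tokens[i:i + BLOCK_CAPACITY]
            for i in range(0, len(tokens), BLOCK_CAPACITY)]


if __name__ == "__main__":
    random.seed(0)
    tokens = random.sample(range(100_000), 1000)
    grown = grow(tokens)
    compacted = dump_and_reload(grown)
    print(len(grown), "blocks after incremental inserts")
    print(len(compacted), "blocks after dump and reload")
```

The reloaded list uses fewer blocks because every block but the last is
full.  As new tokens arrive the splits start again, which is why the
gain only lasts for a while.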

Dump/reload is done with bogoutil, and bogofilter's source distribution
includes a script, bf_compact, for this purpose.  Look to see if
your distro includes the script.  As I don't use SQLite, I can't
predict how much space will be recovered by running the script.
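
On the SQLite side, the general-purpose way to reclaim free space in
any SQLite file is the VACUUM statement.  The Python sketch below shows
the effect on a throwaway database.  The table layout and the
vacuum_demo helper are assumptions made up for the demonstration, not
bogofilter's real schema, and how much space it would recover on an
actual wordlist is untested here.

```python
import os
import sqlite3
import tempfile


def vacuum_demo(path):
    """Create a scratch database, delete half its rows, then VACUUM it.

    Returns the file size in bytes before and after the VACUUM.
    """
    conn = sqlite3.connect(path)
    # Hypothetical table for illustration only.
    conn.execute("CREATE TABLE tokens (word TEXT PRIMARY KEY, count INTEGER)")
    conn.executemany(
        "INSERT INTO tokens VALUES (?, ?)",
        ((f"token{i:06d}", i) for i in range(20000)),
    )
    conn.commit()

    # Deleting rows frees space inside the file but does not shrink it.
    conn.execute("DELETE FROM tokens WHERE count % 2 = 0")
    conn.commit()
    before = os.path.getsize(path)

    # VACUUM rebuilds the file with the remaining rows packed together.
    # It must run outside a transaction, hence the commit above.
    conn.execute("VACUUM")
    conn.close()
    after = os.path.getsize(path)
    return before, after


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        before, after = vacuum_demo(os.path.join(tmp, "demo.db"))
        print("before VACUUM:", before, "bytes")
        print("after VACUUM: ", after, "bytes")
```

If you try this on a real wordlist, work on a copy while bogofilter is
not running.  VACUUM needs temporary disk space comparable to the size
of the database while it rebuilds the file.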

HTH,

David

On Sat, 30 Oct 2010 22:28:12 -0700 (PDT)
Doug wrote:

> I have been running bogofilter for about 2 weeks and my wordlist size
> is now almost 120M
> 
> 117240 -rw-rw----    1 admin    admin    119927808 Oct 31 01:04
> wordlist.db
> 
> at what rate can I expect this to grow? Does it just continually
> grow? If so I suppose at some point it becomes unusable?
> 
> I am using Sqlite.  Is there a way to compact it using Sqlite? 


