Erase old tokens ?

David Relson relson at osagesoftware.com
Thu Apr 20 12:19:00 CEST 2006


On Thu, 20 Apr 2006 11:36:47 +0200
Belette wrote:

> Hello,
> I would like to know something.
> 
> When I use bogoutil -d wordlist.db, I get:
> 
> XXXX 3 4 20063101
> 
> 
> I guess XXXX is the token and 20063101 is the last day I saw this token, but
> what do 3 and 4 mean?
> 
> Is 3 the number of times this token has been seen, and 4 the score? I
> cannot find any help :(
> 
> The aim of this question is to "clean" the wordlist.db, since its size is
> about 500 MB :(
> 
> I would like to erase tokens which are unused.
> 
> Any idea how to write that kind of script?
> 
> thx

Hello Belette,

This isn't really a bogofilter-dev question.  It's more of a usage
question and belongs on the bogofilter list.  That aside:

The three numbers are the spam count, ham count, and date last modified.
In your example, the token has a spam count of 3 and a ham count of 4.
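
To make the column layout concrete, here is a minimal sketch that labels
each field of a dump line.  The function name describe_token is only an
illustration, not part of bogofilter, and it assumes tokens contain no
whitespace.

```shell
# Label the four columns of a "token spam ham date" dump line.
# describe_token is a hypothetical helper name, not a bogofilter tool.
describe_token() {
    awk '{ printf "token=%s spam=%s ham=%s date=%s\n", $1, $2, $3, $4 }'
}
```

Feeding it your example line (XXXX 3 4 20063101) prints
token=XXXX spam=3 ham=4 date=20063101.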

Bogofilter only changes the database when you register a message, so
the date shows when a token was last registered, not when it was last
seen while scoring.  If bogofilter updated the database whenever you
scored a message, your hard drive would be much busier than necessary.

Compacting your wordlist might be all you need to do.  The bf_compact
script can do that for you, or you can do it by hand with:

   bogoutil -d wordlist.db | bogoutil -l wordlist.db.new
   mv wordlist.db wordlist.db.old
   mv wordlist.db.new wordlist.db

This trio of commands assumes you're _not_ using transactions.  If
you're using transactions, the new wordlist needs to be in a new
directory (so the log files have a home).  It's easier to use
bf_compact.
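
If compacting alone isn't enough and you really want to drop old tokens,
one possible approach is to filter the dump before reloading it.  The
sketch below is only an illustration: prune_old_tokens is a made-up
helper name, it assumes the date column compares correctly as a number
(for example YYYYMMDD), and it assumes that tokens starting with "." are
bookkeeping entries that should always be kept.

```shell
# Sketch: drop tokens whose last-modified date is older than a cutoff.
# Reads "token spam ham date" lines on stdin, writes survivors to stdout.
# prune_old_tokens is a hypothetical helper name, not part of bogofilter.
prune_old_tokens() {
    cutoff="$1"   # same numeric date form as the dump's last column
    awk -v cutoff="$cutoff" '
        # Assumption: tokens starting with "." are bookkeeping entries; keep them.
        substr($1, 1, 1) == "." { print; next }
        # The date is the last field; keep the line if it is not older than cutoff.
        ($NF + 0) >= (cutoff + 0) { print }
    '
}
```

It would slot into the middle of the pipeline above, for example:

   bogoutil -d wordlist.db | prune_old_tokens 20060101 | bogoutil -l wordlist.db.new

Keep the old wordlist around until you've checked that the new one
scores mail the way you expect.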

Hope this helps!

David


