Maintaining a snappy bogofilter

David Relson relson at osagesoftware.com
Thu Apr 10 15:00:54 CEST 2003


At 08:41 AM 4/10/03, Chris Ditri wrote:

>Hello Everyone,
>
>I was wondering what people do to keep their goodlist and spamlist databases
>fast and trim.  Do they need to be rebuilt from time to time or somehow
>"defragged"?
>
>Any recommendations?
>
>Thanks!
>
>Chris

Chris,

My spamlist currently has 80,413 words and 11,306 messages and my goodlist 
has 235,043 words and 29,736 messages.  Performance seems fine and I don't 
do anything to keep it fast and trim.

If I _were_ to do something, I'd use the maintenance capabilities in 
bogoutil.  Two capabilities in particular come to mind.  The first is the 
ability to delete all hapaxes, i.e. words occurring only once in the 
corpus.  The second is the ability to delete all words older than a certain 
age.
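
To make those two operations concrete, here is a rough Python sketch of the 
idea.  It is not bogoutil's code and does not touch bogofilter's database 
format; the Entry type, the function names, the 90-day limit and the sample 
words are all made up for illustration.

```python
from datetime import date, timedelta
from typing import Dict, NamedTuple


class Entry(NamedTuple):
    """Hypothetical per-word record: a count and a last-seen date."""
    count: int
    last_seen: date


def drop_hapaxes(wordlist: Dict[str, Entry]) -> Dict[str, Entry]:
    """Return a copy without words that were seen only once."""
    return {word: e for word, e in wordlist.items() if e.count > 1}


def drop_older_than(wordlist: Dict[str, Entry], max_age_days: int,
                    today: date) -> Dict[str, Entry]:
    """Return a copy without words last seen more than max_age_days ago."""
    cutoff = today - timedelta(days=max_age_days)
    return {word: e for word, e in wordlist.items() if e.last_seen >= cutoff}


# Made-up sample data standing in for a spamlist.
today = date(2003, 4, 10)
spamlist = {
    "viagra": Entry(120, date(2003, 4, 9)),
    "xqzt17": Entry(1, date(2003, 4, 1)),       # hapax: seen once
    "mortgage": Entry(45, date(2002, 10, 1)),   # stale: older than 90 days
}

trimmed = drop_older_than(drop_hapaxes(spamlist), 90, today)
print(sorted(trimmed))  # ['viagra']
```

For the real thing, check the bogoutil man page for the maintenance options 
your version supports.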

The ability is there, but I don't know at what point it becomes worthwhile 
to use it.

David





