Bogofilter migration & tuneup

Robin Bowes robin-lists at robinbowes.com
Mon Dec 5 15:31:48 CET 2005


Matthias Andree said the following on 05/12/2005 13:57:
> Robin Bowes <robin-lists at robinbowes.com> writes:
>>The thing is, wordlist.txt is currently around 4.7GB in size and
>>growing! The original wordlist.db is 105MB.
>>
>>How can I reduce the size of the wordlist?
> 
> 
> The wordlist.db file is likely corrupt and looping. If you have log
> files for this wordlist.db, then running
> 
>     bogoutil.0.93 --db-recover=/path/to/.bogofilter.bak
> 
> should fix this.
> 
> If it does not, retry with --db-recover-harder or see doc/README.db for
> other recovery strategies.

Hmmm. So running bogoutil with --db-prune was not a good idea then? :(

I ran bogoutil with --db-recover anyway, and then with --db-recover-harder.

I'm now re-running bogoutil -d wordlist.db > wordlist.txt to see whether
the dump balloons again.
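
One way to stop a looping dump from filling the disk is to cap its output. What follows is only a minimal sketch: it assumes a `head` that accepts `-c` (GNU and BSD both do), `capped_dump` is a made-up helper name, and the 10x factor in the usage note is an arbitrary guess at "far too big".

```shell
#!/bin/sh
# capped_dump LIMIT_BYTES OUTFILE COMMAND [ARGS...]
# (hypothetical helper, not part of bogofilter)
# Runs COMMAND and keeps at most LIMIT_BYTES of its output in OUTFILE.
# Returns 0 if the output stayed under the cap, 1 if it reached the cap.
capped_dump() {
    limit=$1
    out=$2
    shift 2
    # head exits once LIMIT_BYTES have been read, so a COMMAND that keeps
    # writing gets a broken pipe instead of filling the disk.
    "$@" | head -c "$limit" > "$out"
    # Arithmetic expansion strips the padding some wc implementations add.
    size=$(($(wc -c < "$out")))
    [ "$size" -lt "$limit" ]
}

# Usage (not run here): allow the text dump up to 10x the database size.
# db_size=$(($(wc -c < wordlist.db)))
# capped_dump $((db_size * 10)) wordlist.txt bogoutil -d wordlist.db \
#     || echo "dump hit the cap; wordlist.db is probably still damaged" >&2
```

A dump that hits the cap is truncated and should not be loaded back; it only signals that the database still needs recovery.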

> My apologies if the option is actually named differently, I haven't
> looked at 0.93.5 in a while.

Me neither - it's just been sat there working for me!

>>One last thing, on the old machine, the .bogofilter directory "filled
>>up" with loads of DB log files. I'm not really interested in keeping all
>>of them. Is the correct way to keep these in check to use a cron task
>>running "bogoutil --db-prune" ?
> 
> 
> That would work with older bogofilter versions, you don't need this
> after the upgrade though: 1.0.0 removes log files automatically if they
> are no longer of use (but can be configured to leave these behind if so
> desired).

Ah, OK. That's a definite improvement.

> David,
> 
> should we run the verify method by default before dumping, and if verify
> fails, either try recovery (on TXN) or request the user to use db_dump
> instead (on traditional)?  This might be one non-bugfix item I'd be
> willing to let into 1.0 as it improves robustness when users upgrade
> from 1.0.X to 1.1 later. Not that 1.1 is in sight, though. :-)

Also, it might be worth running the verify method before --db-prune does
its work.
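
Until something like that is built in, the "verify first" idea can be approximated from the outside. This is a rough sketch under assumptions: that bogoutil accepts `--db-verify=FILE` and exits non-zero when verification fails (check the man page for the installed version), and `safe_dump` and the `BOGOUTIL` variable are names invented here for illustration.

```shell
#!/bin/sh
# Which bogoutil binary to call; overridable, e.g. BOGOUTIL=bogoutil.0.93
BOGOUTIL=${BOGOUTIL:-bogoutil}

# safe_dump DBFILE OUTFILE   (hypothetical helper, not part of bogofilter)
# Dumps DBFILE to OUTFILE only if verification succeeds first.
# ASSUMPTION: --db-verify=FILE exists and returns non-zero on failure.
safe_dump() {
    db=$1
    out=$2
    if "$BOGOUTIL" --db-verify="$db" >/dev/null 2>&1; then
        "$BOGOUTIL" -d "$db" > "$out"
    else
        echo "verify failed for $db; try recovery before dumping" >&2
        return 1
    fi
}

# Usage (not run here):
# safe_dump wordlist.db wordlist.txt
```

The same guard could wrap a --db-prune call in place of the dump.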

Thanks,

R.
