bogoupgrade

David Relson relson at osagesoftware.com
Fri Aug 22 19:10:03 CEST 2003


On Fri, 22 Aug 2003 13:37:02 -0300
Rodrigo Bernardo Pimentel <rbp at isnomore.net> wrote:

...[snip]...

>         BTW, why are bogofilter databases so often corrupted? For a time,
> every time I upgraded bogofilter (which was *not* on every bogofilter
> release) I trained it again and stored the previous db files. When I saw a
> previous post about corrupt databases I went to check mine and *all* of them
> (4, actually) were corrupted. Ok, so I upgraded to 0.14.3-1 (Debian) and
> retrained it. That was about a week ago (9 days, I think). And yesterday I
> noticed wordlist.db is corrupt again!
> 
> rbp at francesca:~/.bogofilter$ db4.1_verify wordlist.db
> db_verify: Page 1707: out-of-order key at entry 59
> db_verify: DB->verify: wordlist.db.bak: DB_VERIFY_BAD: Database verification failed
> 
>         Now, this shouldn't be happening so often, right? At the very least,
> we should be getting some sort of warning from bogofilter ("Warning: corrupt
> database, please retrain" or something).
> 
>         And what causes corruption? A message bogofilter doesn't understand?
> 
>         I'm implementing what David Relson suggested (cronjob to backup and
> verify daily), but that's just a workaround, we should try and understand
> what's causing this.

Rodrigo,

You ask some good questions, and I wish I knew the answers.  I'd like to see the problem identified!

At one time the suspected cause was incorrect use of BerkeleyDB's locking mechanism by bogofilter.  Matthias did a lot of work on that and fixed it, so far as we know.  As your results show, there's still some kind of a problem.

What's your environment - distribution, kernel, BerkeleyDB version?  Is there anything you're aware of that may be different about your setup?

I just checked my 7 daily wordlist.db backups and db_verify thinks they're all fine.  Of course, the fact that it's working for me doesn't mean a whole lot.  FWIW, my wordlist.db is approx 44MB and has about 780,000 tokens in it. 
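
A daily backup-and-verify job of the kind discussed here could look roughly like the Python sketch below. It is an illustration under assumptions, not the script anyone on the list actually runs: the function names, the date-suffixed backup file names and the backup directory are made up for the example, and keep=7 simply mirrors the seven daily backups mentioned above. It relies on db_verify exiting with status 0 when the database is fine and non-zero otherwise. It does no locking, so it assumes nothing is writing to wordlist.db while the copy is made.

```python
import shutil
import subprocess
from datetime import date
from pathlib import Path


def verify(db_path, verify_cmd=("db_verify",)):
    """Return True if the verify command exits with status 0 for db_path."""
    result = subprocess.run([*verify_cmd, str(db_path)],
                            capture_output=True, text=True)
    return result.returncode == 0


def backup_and_verify(wordlist, backup_dir, keep=7,
                      verify_cmd=("db_verify",), today=None):
    """Copy wordlist into backup_dir and verify the copy.

    Older copies are pruned only when the new copy verifies, so a corrupt
    database never pushes the last good backups out of the rotation.
    Returns (backup_path, ok). keep should be at least 1.
    """
    wordlist = Path(wordlist)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)

    today = today or date.today()
    # Hypothetical naming scheme: wordlist.db.2003-08-22
    backup = backup_dir / f"{wordlist.name}.{today.isoformat()}"
    shutil.copy2(wordlist, backup)

    ok = verify(backup, verify_cmd)
    if ok:
        # ISO dates sort chronologically, so sorting by name sorts by age.
        copies = sorted(backup_dir.glob(f"{wordlist.name}.*"))
        for old in copies[:-keep]:
            old.unlink()
    return backup, ok
```

Run once a day from cron, a caller would pass the path to ~/.bogofilter/wordlist.db (built with Path.home(), since "~" is not expanded automatically) and mail or print a warning whenever ok comes back False. On a system where the tool has a versioned name, such as db4.1_verify, that name can be passed as verify_cmd=("db4.1_verify",).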

We'd appreciate any info that you think might help!

Peace,

David



