bogoutil (performance ?)
David Relson
relson at osagesoftware.com
Wed May 28 17:17:10 CEST 2003
At 10:40 AM 5/28/03, T'aZ wrote:
>#db_verify goodlist.db
>db_verify: Out-of-order key, page 1449 item 79
>db_verify: Last item on page 694 sorted greater than parent entry
>db_verify: Last item on page 1010 sorted greater than parent entry
>db_verify: First item on page 694 sorted greater than parent entry
>db_verify: Page 694 linked twice
>db_verify: DB->verify: goodlist.db: DB_VERIFY_BAD: Database verification
>failed
>
> > Can you dump the wordlist, i.e. "bogoutil -d goodlist.db". If that
> > works, the database is ok.
>
>ugh , scrolled correctly until words beginning with e , then restarted
>from letters b, rescrolling again , then restarting at b etc etc etc
>
>:( seems b0rked :'(
>
>iirc my first version was 0.11.something
You've got a broken database. We'll probably never know why. The locking
problems were fixed AFAIK in 0.10. How large a quantity of email do you
deal with? Which version of BerkeleyDB are you running?

Anyhow, you can try to recover data using db_dump or, if you have saved ham
and spam, you can start over and train bogofilter with what you have saved.
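
A rough sketch of the db_dump route, assuming your Berkeley DB tools include db_dump's -r (salvage) option along with db_load and db_verify. The function name and file names are placeholders, not anything bogofilter ships:

```shell
# Salvage whatever db_dump can read from a damaged wordlist and load it
# into a fresh database file. salvage_wordlist is a hypothetical helper.
salvage_wordlist() {
    broken=$1                 # damaged file, e.g. goodlist.db
    rebuilt=$2                # new file to create, e.g. goodlist.new.db
    dump="$rebuilt.dump"      # intermediate text dump

    # -r asks db_dump to salvage data from a possibly corrupt file.
    # It may exit non-zero and still have written usable records.
    db_dump -r "$broken" > "$dump" ||
        echo "db_dump reported errors; loading what was salvaged" >&2

    # Build a new database from the dump, then verify the result.
    db_load -f "$dump" "$rebuilt" && db_verify "$rebuilt"
}

# Example: salvage_wordlist goodlist.db goodlist.new.db
```

Keep the broken original until the rebuilt file dumps cleanly with "bogoutil -d"; how much data survives depends on how badly the file is damaged.
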
A possible precaution would be to snapshot your wordlists periodically
from a cron job. For example, on Sundays "cp -a $BOGOFILTER_DIR save.1",
on Mondays "cp -a $BOGOFILTER_DIR save.2", etc. As part of the same cron
job, check the word counts, e.g. "bogoutil -d list.db | wc -l".
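
The snapshot idea could look something like the following. This is only a sketch: the function names and the backup location are made up for illustration, and the slot numbering (Sunday = 1, Monday = 2, ...) just follows the example above.

```shell
# Sketch of a daily snapshot job for the wordlist directory.
# Function names and the backup location are hypothetical.

# Copy the wordlist directory into save.<slot>. By default the slot is
# the day of the week (1 = Sunday ... 7 = Saturday), so a week of copies
# is kept and each one is overwritten seven days later.
snapshot_wordlists() {
    src=$1
    dest_root=$2
    slot=$3
    [ -n "$slot" ] || slot=$(( $(date +%w) + 1 ))
    [ -n "$src" ] && [ -n "$dest_root" ] || return 1
    mkdir -p "$dest_root" || return 1
    rm -rf "$dest_root/save.$slot"
    cp -a "$src" "$dest_root/save.$slot"
}

# Number of lines in a dumped wordlist.
count_words() {
    bogoutil -d "$1" | wc -l | tr -d ' '
}

# Succeeds when the new count (first argument) is lower than the
# previous count (second argument).
count_dropped() {
    [ "$1" -lt "$2" ]
}

# Example cron script body:
#   snapshot_wordlists "$BOGOFILTER_DIR" "$HOME/bogofilter-backup"
#   count_words "$BOGOFILTER_DIR/goodlist.db"
```

A dump that never finishes, or a count that suddenly drops, is the signal to fall back to the previous day's copy. It is probably safest to take the copy at a time when no mail is being delivered.
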
You're the first person to report database corruption in a long
time. Hopefully it's a fluke and doesn't happen again.

David