Converting old wordlist.db to Berkeley format?

Matthias Andree matthias.andree at gmx.de
Mon Sep 5 01:19:44 CEST 2016


On 03.09.2016 at 13:52, Geoff wrote:

> Following all your advice, I exported a wordlist.txt from my old machine /
> tools and imported it into the new system. The result (over several tries and
> variations) was that db_verify would initially report that the database was
> valid, but after bogofilter had operated on just one new test email (including
> an imaginary word so that I could trace it), bogofilter -w would find the word,
> correctly classified, but db_verify would now report:
>
> db_verify ~/.bogofilter/wordlist.db
> db_verify: BDB2506 file unknown has LSN 576/24366, past end of log at 1/28
> db_verify: BDB2507 Commonly caused by moving a database from one database environment
> db_verify: BDB2508 to another without clearing the database LSNs, or by removing all of
> db_verify: BDB2509 the log files from a database environment
> db_verify: BDB0522 Page 0: metadata page corrupted
> db_verify: BDB0523 Page 0: could not check metadata page
> db_verify: /home/me/.bogofilter/wordlist.db: BDB0090 DB_VERIFY_BAD: Database verification failed
> BDB5105 Verification of /home/me/.bogofilter/wordlist.db failed.

This looks very much like several non-matching instances of Berkeley DB
and/or bogofilter on the computer (try "which -a bogofilter"), and in
particular like bogofilter not using the same Berkeley DB version each
time - as though an old and a new version were accessing the same
database. bogofilter -V and bogoutil -V each report their own version and
the Berkeley DB version they are using. Be sure to use the same Berkeley
DB version (library, utilities) consistently when you use the db_* tools
and bogofilter.
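
A rough way to check for mixed installations is sketched below. It walks
PATH the way "which -a" does and asks each copy for its version. The
helper names find_all and report_versions are made up for this example,
and it assumes each tool prints a version banner for -V (the db_* tools
do that as well).

```shell
#!/bin/sh
# Sketch: list every copy of a tool on PATH with the version banner it prints.
# find_all and report_versions are made-up helper names, not part of bogofilter.

find_all() {
    # Print each executable called $1 on PATH, in search order (like "which -a").
    old_ifs=$IFS
    IFS=:
    for dir in $PATH; do
        if [ -n "$dir" ] && [ -f "$dir/$1" ] && [ -x "$dir/$1" ]; then
            echo "$dir/$1"
        fi
    done
    IFS=$old_ifs
}

report_versions() {
    for tool in "$@"; do
        found=$(find_all "$tool")
        if [ -z "$found" ]; then
            echo "$tool: not found on PATH"
            continue
        fi
        # One line per copy; paths containing newlines are not handled.
        echo "$found" | while IFS= read -r exe; do
            echo "== $exe"
            "$exe" -V </dev/null 2>&1 | sed 's/^/   /'
        done
    done
}

report_versions bogofilter bogoutil db_verify
```

If a tool shows up more than once, or the Berkeley DB versions in the
output differ, that would match the symptoms described above.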

> db_verify was a bit of a nuisance in that it demanded more mutex space than was
> allocated by the system, and sometimes would not run at all for that reason.
> Google shows that this is an old issue and (apparently) unrelated to the
> problem I was trying to solve.

I can reproduce one part of the problem, and explain it.

Running bogoutil -l created the log files, the wordlist.db.new file, and
"the environment" in the __db.00* files (__db.001 and counting).
In this environment, Berkeley DB stores the file name, "wordlist.db.new".

Now you've renamed the file to "wordlist.db".
But the __db.* files still point to the "wordlist.db.new" file - which
is no longer there.  That causes db_verify to fail.

There are several remedies: bogoutil --db-recover, bogoutil
--db-remove-environment, or even bogoutil --db-verify. After running any
one of them, db_verify ~/.bogofilter/wordlist.db succeeds as well.
db_verify only fails in the window between the rename and the next run of
bogofilter, bogoutil or db_recover. All of these end up ditching the stale
__db.* files, which resolves the situation because the database itself is
not corrupted.
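
As a sketch of one of these remedies (the function name clear_stale_env
is made up, and the directory is only an example), the following removes
the environment and then re-runs db_verify. It assumes bogoutil and
db_verify come from the same Berkeley DB build and that bogofilter is not
running at the time:

```shell
#!/bin/sh
# Sketch: after renaming wordlist.db.new to wordlist.db, drop the stale
# __db.* environment and then re-check the database.
# clear_stale_env is a made-up name; the directory argument is an example.

clear_stale_env() {
    db_dir=$1
    # Any one of the remedies is enough; this one removes the environment.
    bogoutil --db-remove-environment="$db_dir" || return 1
    # The data file itself was never damaged, so this should now pass.
    db_verify "$db_dir/wordlist.db"
}

# Example call, with bogofilter not running at the time:
#   clear_stale_env "$HOME/.bogofilter"
```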

BTW, you can get rid of log files that are no longer needed by:

    bogoutil --db-prune=$HOME/.bogofilter

Remember: If you copy databases that use transactions, you need to copy
all the log.* files.
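
A minimal sketch of such a copy follows. The function name copy_wordlist
and the directory arguments are made up for the example. It takes
wordlist.db together with every log.* file and deliberately leaves the
__db.* environment files behind, on the assumption (in line with the
above) that those get rebuilt on the next run. Make sure nothing is
writing to the database while copying.

```shell
#!/bin/sh
# Sketch: copy a transactional wordlist together with its log files.
# copy_wordlist is a made-up helper; run it only while bogofilter is idle.

copy_wordlist() {
    src=$1
    dst=$2
    mkdir -p "$dst" || return 1
    cp -p "$src/wordlist.db" "$dst/" || return 1
    # log.* are the transaction logs; __db.* files are intentionally skipped.
    for f in "$src"/log.*; do
        [ -e "$f" ] || continue   # glob matched nothing: no log files present
        cp -p "$f" "$dst/" || return 1
    done
}

# Example call:
#   copy_wordlist "$HOME/.bogofilter" /mnt/backup/bogofilter
```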

Note: /not/ using transactions - which is your current setup - means
that *ANY* crash of computer or bogofilter software can corrupt the
database. The transaction stuff is meant to make it crashproof.



