Converting old wordlist.db to Berkeley format?

Matthias Andree matthias.andree at gmx.de
Fri Sep 2 20:57:33 CEST 2016


On 02.09.2016 at 10:28, Geoff wrote:
> I have been using bogofilter since 2003, and have a large wordlist.db (154Mb),
> that has been regularly compacted, using the pre-0.93 bogofilter instructions
> in the FAQ. Negligently I have stayed with an old self-compiled version of
> bogofilter which has been perfectly adequate for my needs.  One reason for that
> is that I vaguely recall (it was ages ago), having problems when the change to
> the Berkeley DB was made.
>
> For reasons with which I need not bore you, I now want to use the current
> (1.2.4) bogofilter from my Arch distro repo.  If I run ..
>
> bogoutil -d wordlist.db | bogoutil -l wordlist.db.new
>
> .. the command generates: 573 log files, a 115.3 Mb wordlist.db.new, __db.001,
> __db.002, __db.003, lockfile-d and lockfile-p 
>
> The resulting wordlist (renamed to wordlist.db, of course) does not, however,
> appear to be usable:
>
> db_verify: /home/me/.bogofilter/wordlist.db: No such file or directory
> BDB5105 Verification of /home/me/.bogofilter/wordlist.db failed.
>
> Attempts to use bogoutil -w fail with :
> Error accessing file or directory '/home/me/.bogofilter/wordlist.db'.
> error #2 - No such file or directory.
>
> .. even though the directory and file are certainly present.

Geoff,

that's an interesting failure mode. It looks as though something went
wrong with the __db.* files, and that can confuse Berkeley DB. The
library is normally robust, but if it is made to operate on __db.* files
from an older bogofilter version, it can get thoroughly confused even if
the database itself is still fine.

The log files are usually fine though, and also precious; if they are of
an older incompatible format, Berkeley DB will just start writing to a
new log file and go on. Deleting the __db.* files is safe, provided you
are sure that no bogofilter/bogoutil program is running at the time.
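
A minimal sketch of that cleanup, assuming a POSIX shell and that pgrep
is installed. The function name remove_region_files is made up for
illustration; it is not part of bogofilter.

```shell
# Hypothetical helper: delete the Berkeley DB region files (__db.*) in a
# bogofilter directory, but only if no bogofilter/bogoutil process is found.
# Assumption: pgrep is available. If it is missing, the check below cannot
# detect anything and the files are removed anyway, so verify by hand then.
remove_region_files() {
    dir="${1:-$HOME/.bogofilter}"

    if pgrep -x bogofilter >/dev/null 2>&1 || pgrep -x bogoutil >/dev/null 2>&1; then
        echo "bogofilter or bogoutil is still running; leaving $dir alone" >&2
        return 1
    fi

    # Only the __db.* files go; log.* and wordlist.db are left untouched.
    rm -f -- "$dir"/__db.*
}
```

Called without an argument it works on ~/.bogofilter; pass a different
directory as the first argument if your wordlist lives elsewhere.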

I'm not sure it's worth several round trips of my asking questions and
your answering them just for analysis purposes, so let's keep it short.
The key questions are:

- Is the bogofilter version (along with the necessary libdb) that you've
used all the years still available to you, from a backup?
- What was the old bogofilter version?
- What was the Berkeley DB version that bogofilter used to use?
- What is the current Berkeley DB version that bogofilter is attempting
to use?

The canonical way to bring your old wordlist.db to a new installation,
if the old versions are still available to you, would be:

0. Stop your mail system and any cron jobs that could call bogofilter
and related programs.
1. Put the old database (along with log.* files, if any) and the old
executables/libraries back in place. Below, the old bogoutil is called
bogoutil_old for clarity (no need to actually rename it).
2. Delete the __db.* files (they will be recreated as needed): rm __db.*
3. Using the old code, try to dump the old data:
bogoutil_old -d ~/.bogofilter/wordlist.db >~/.bogofilter/wordlist.txt
If that complains, we need something else, see below.
4. Rename the .bogofilter directory: mv .bogofilter .bogofilter.old
5. Create a new one: mkdir .bogofilter
6. If necessary, use cp -p to copy the old configuration file from the
old into the new directory (having one in there isn't too common; it
might be a DB_CONFIG).
7. Install the new bogofilter 1.2.4 executables and the libdb.
8. Import the dumped data:
bogoutil_new -l ~/.bogofilter/wordlist.db <~/.bogofilter.old/wordlist.txt
9. Check that everything works and, if needed, restart your mail system.
Note that command line options and output have changed since the old
days of bogofilter 0.92.X, so be sure to read the RELEASE.NOTES file.
10. If everything works, remove the .bogofilter.old directory.
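
Steps 2 to 8 above could be scripted roughly as follows. This is only a
sketch under a few assumptions: the old and new bogoutil are both
installed side by side (so step 7 has already happened), the only
configuration file worth copying is DB_CONFIG, and the function name
migrate_wordlist and its arguments are made up for illustration. Stop
the mail system and cron jobs (step 0) before running anything like it.

```shell
# Hypothetical helper covering steps 2-8 of the procedure.
#   $1  path to the OLD bogoutil (e.g. restored from backup)
#   $2  path to the NEW bogoutil (1.2.4)
#   $3  directory containing .bogofilter (defaults to $HOME)
migrate_wordlist() {
    old_bogoutil="$1"
    new_bogoutil="$2"
    base="${3:-$HOME}"
    dir="$base/.bogofilter"

    # Step 2: remove the region files; they are recreated as needed.
    rm -f -- "$dir"/__db.*

    # Step 3: dump with the OLD code. Stop here if it complains.
    "$old_bogoutil" -d "$dir/wordlist.db" > "$dir/wordlist.txt" || return 1

    # Steps 4 and 5: move the old directory aside, create a fresh one.
    mv -- "$dir" "$dir.old" || return 1
    mkdir -- "$dir" || return 1

    # Step 6: carry over DB_CONFIG if one exists (assumed to be the only
    # configuration file that matters).
    if [ -f "$dir.old/DB_CONFIG" ]; then
        cp -p -- "$dir.old/DB_CONFIG" "$dir/" || return 1
    fi

    # Step 8: load the text dump with the NEW code.
    "$new_bogoutil" -l "$dir/wordlist.db" < "$dir.old/wordlist.txt"
}
```

It deliberately stops short of steps 9 and 10: checking that the new
wordlist works, and removing .bogofilter.old afterwards, is better done
by hand.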

The "something else", for instance if you don't have backup copies of
the old bogofilter/bogoutil executables or the database that you've
used all the time, is described in the new bogofilter version's
doc/README.db file. Section 3.2 shows several recovery methods in order
of ascending level of desperation. They only help if the database itself
is still intact.


Hope that helps.

Cheers,
Matthias


