New version, database incompatible -retrain again!
Adrian
adrian at aeolian.org.uk
Sun Feb 9 16:25:52 CET 2025
On Sun, 09 Feb 2025 20:01:24 +0900
Masaru Nomiya <nomiya at lake.dti.ne.jp> wrote:
> Hello,
>
> In the Message;
>
> Subject : New version, database incompatible -retrain again!
> Message-ID : <20250209094909.5fe40664.adrian at aeolian.org.uk>
> Date & Time: Sun, 9 Feb 2025 09:49:09 +0000
>
> Adrian via bogofilter <bogofilter at bogofilter.org> has written:
>
> > I use bogofilter with Claws Mail. When I install a new version of
> > Claws and presumably bogofilter/Berkeley DB is upgraded as a result, the
> > database (wordlist.db) becomes incompatible and I'm asked to start
> > retraining from scratch.
>
> > If I thought about it in advance, I could dump the wordlist to text and
> > reload it into the new database, if that would work. If I don't
> > remember, all I can do is keep a few thousand spams somewhere as a
> > training corpus.
>
> > Is there a simple solution?
>
> I don't know if it's simple or not, but this is how I do it.
>
> $ bogoutil -d wordlist.db | bogoutil -l wordlist.db.new
> $ mv wordlist.db wordlist.db.prev
> $ cp wordlist.db.new wordlist.db
>
> Test
>
> $ bogofilter --spam-cutoff 0.9 -k 10 -v < ~/var/Mail/inbox/10
>
> X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.3.0.beta1
>
> Best Regards & Good Night.
Many thanks!

This agrees with what I thought. It's simple enough, but only if
you remember to do it before an upgrade, while your bogoutil version
still matches your database version!

Obviously the ultimate cause lies with Oracle Berkeley DB, which seems
to provide no migration method across version changes.

I'll boot an old Linux Flash install and install bogofilter on it. That
may pull in a compatible version. But I'm not holding my breath.

I plan to create a job that dumps the latest wordlist.db to text
once a month. So even if I have to retrain from scratch this time, I'll
always have a reasonably recent dump to reload from in future.
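
A minimal sketch of such a monthly dump job, assuming bogoutil is on the
PATH and still matches the database format. The function name
dump_wordlist, the BOGOUTIL override variable and the example paths are
illustrative assumptions, not part of bogofilter; only "bogoutil -d"
comes from the commands quoted above.

```shell
#!/bin/sh
# Sketch: dump a bogofilter wordlist to a dated plain-text file.
# dump_wordlist and BOGOUTIL are hypothetical names used for illustration.

dump_wordlist() {
    db=$1        # wordlist to dump, e.g. "$HOME/.bogofilter/wordlist.db"
    outdir=$2    # backup directory, e.g. "$HOME/backups/bogofilter"
    out="$outdir/wordlist-$(date +%Y-%m).txt"

    mkdir -p "$outdir" || return 1

    # Dump to a temporary file first, so a failed or empty dump
    # never replaces an earlier good backup.
    if "${BOGOUTIL:-bogoutil}" -d "$db" > "$out.tmp" && [ -s "$out.tmp" ]; then
        mv "$out.tmp" "$out"
    else
        rm -f "$out.tmp"
        return 1
    fi
}

# Run only when called with both arguments: script.sh DB OUTDIR
if [ "$#" -eq 2 ]; then
    dump_wordlist "$1" "$2"
fi
```

Saved under a name of your choosing, this could be run from cron on the
first of each month, for example with the schedule "0 3 1 * *". A dump
made this way can later be reloaded with "bogoutil -l" as shown in the
quoted message.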

UPDATE

Since starting this reply, I've had another look. The wordlist.db is
an SQLite 3 database, not Berkeley DB! It has a single table, and each
row holds a key and a blob. It might be fairly easy to save the blobs
and unpack them into the string format of a text dump. I'll try a few
things and report back.
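
One way to check which format a given wordlist.db uses is to look at
its first bytes: every SQLite 3 file begins with the string
"SQLite format 3" followed by a NUL byte. The sketch below does only
that check; is_sqlite3 is a hypothetical helper name, and nothing here
reads or decodes the blobs.

```shell
#!/bin/sh
# Sketch: report whether a file carries the SQLite 3 file header.
# is_sqlite3 is a hypothetical helper name used for illustration.

is_sqlite3() {
    # Read the first 15 bytes and compare them with the SQLite magic
    # string. tr drops NUL bytes that a non-SQLite binary may contain.
    magic=$(dd if="$1" bs=15 count=1 2>/dev/null | tr -d '\000')
    [ "$magic" = "SQLite format 3" ]
}

# Run only when called with one argument: script.sh FILE
if [ "$#" -eq 1 ]; then
    if is_sqlite3 "$1"; then
        echo "$1: SQLite 3 database"
    else
        echo "$1: not SQLite 3 (possibly Berkeley DB or another backend)"
    fi
fi
```

If the file is SQLite and the sqlite3 command-line shell is installed,
"sqlite3 wordlist.db .schema" prints the table definition, which is a
reasonable starting point before attempting to unpack the blobs.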