switching between different databases - in 1.3.0.rc1

Matthias Andree matthias.andree at gmx.de
Thu May 22 23:53:01 CEST 2025


On 22.05.25 at 21:51, Rob McEwen via bogofilter wrote:
> I was finally able to download and successfully install 1.3.0.rc1 on a 
> testing Debian VM. And it worked!
>
> For this first attempt I installed LMDB as the database. The first
> thing I found odd is that in my cursory performance testing
> (one-at-a-time reads only, just evaluating emails, so no writes),
> LMDB was actually a little SLOWER than my 1.2.5 setup using the
> Berkeley database. That was a huge surprise! I had expected so much
> more from LMDB. This was with the same data files, hundreds of MB in
> size, and on similar hardware. Is that result to be expected? Or
> should LMDB have been much faster? I've read that LMDB's performance
> advantage is larger with many concurrent connections, so maybe
> that's what I was missing?


Rob,

What does "slower" mean? How do you measure? What's the Debian version
and what are the database versions? Are you comparing 1.3.0 in the
Debian VM against bogofilter 1.2.5 on a real machine with no VM in
between? What's the underlying filesystem in each case, and how exactly
is it configured?
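
To make "slower" concrete, a small harness along these lines could time
repeated classifications of one stored message. This is a minimal
sketch, not part of bogofilter: the function names time_command and
summarize are made up for illustration, and the command line is whatever
invocation you already use. A few warm-up runs are discarded so a cold
page cache does not dominate the numbers.

```python
import statistics
import subprocess
import time


def time_command(argv, stdin_bytes=b"", runs=20, warmup=3):
    """Run argv repeatedly; return wall-clock seconds for each timed run.

    The first `warmup` runs are executed but not recorded.
    """
    samples = []
    for i in range(warmup + runs):
        start = time.perf_counter()
        # check=False: a classifier may report its verdict through the
        # exit status, so a nonzero status is not treated as a failure.
        subprocess.run(
            argv,
            input=stdin_bytes,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            check=False,
        )
        elapsed = time.perf_counter() - start
        if i >= warmup:
            samples.append(elapsed)
    return samples


def summarize(samples):
    """Reduce a list of timings to a few robust summary figures."""
    return {
        "runs": len(samples),
        "min": min(samples),
        "median": statistics.median(samples),
        "max": max(samples),
    }
```

A call could look like time_command(["bogofilter", "-d", "/path/to/dir"],
stdin_bytes=message), where the directory and the message bytes are
placeholders for your own setup. Each run includes process start-up
time, so compare medians from the same machine and filesystem.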


> Anyway, I'm now going to do more extensive performance testing with
> 1.3.0.rc1, comparing LMDB, Berkeley DB and SQLite3 to each other.


Looking forward to those results.


> Are there any instructions or suggestions for how to switch between
> these three? Can I just re-run the "configure" and "make" commands?
> Or do I need to uninstall first? Or something else?

No need for fussing. Just run configure with a different database
argument, then make, and you should be good. If configure succeeded but
the build does not work, do "make clean" after configure and before you
"make".

Before you install, if you want to reuse the database, run bogoutil
--dump with the old build, then "make install" the newly built stuff
with the other database driver and run bogoutil --load. Note that,
unfortunately, we chose at the time to use the same .db suffix for
BerkeleyDB and SQLite3, so if you're about to switch back and forth
between these two, you'll need to delete or rename the database file.
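
Because the two backends share the .db suffix, it can help to check
which one wrote a given file before deleting or renaming it. The sketch
below relies only on the documented SQLite 3 file header, whose first 16
bytes are the string "SQLite format 3" followed by a NUL byte. The
function name is_sqlite3_file is made up for illustration, and a False
result only means "not SQLite 3"; it does not prove the file is
BerkeleyDB.

```python
# First 16 bytes of every SQLite 3 database file.
SQLITE3_MAGIC = b"SQLite format 3\x00"


def is_sqlite3_file(path):
    """Return True if the file at `path` starts with the SQLite 3 header."""
    with open(path, "rb") as fh:
        return fh.read(len(SQLITE3_MAGIC)) == SQLITE3_MAGIC
```

For example, is_sqlite3_file("wordlist.db") run in the bogofilter
directory tells you whether the existing file came from an SQLite3
build, assuming wordlist.db is the file name in use there.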

Note that for a fair comparison, the Berkeley DB build of bogofilter
should be used in "transactional mode", which makes it robust
(recoverable) against crashes, because the other databases you're
looking at should do just that: transactions. SQLite3 certainly does so.
If the application or computer crashes, you may find wordlist.db-wal
files on disk; that is the write-ahead log (the initials are w, a, l,
hence -wal).


> And do I need to unload LMDB somehow first each time I switch away 
> from LMDB? Or does it just go away after deleting the underlying files?


I don't fully understand your question. If you're asking about the LMDB
package, it can be uninstalled if you're not running a bogofilter
version built against it. The -dev[el] packages are not required at run
time and can be uninstalled unless you intend to build something against
them.

If you are instead asking about the database files in the bogofilter
directory, deleting them suffices.

I have not benchmarked in ages because the results aren't very
transferable to other workloads. Given the little time I have, setting
up and conducting a proper benchmark that covers enough ground for
people to look up their use case and then decide is just more effort
than we can muster.

Should you do any profiling or collect performance metrics and identify
hot spots or I/O slowdowns somewhere, please share your findings.

Hope that helps.

BTW, the plan is to fix that dash-or-underscore bug in the
configuration/documentation, which means I need to look at most of
bogofilter, and then release 1.3.0.rc2 with that fix.
<https://gitlab.com/bogofilter/bogofilter/-/issues/15>


Cheers,
Matthias



