DB backend support for lmdb?

Steffen Nurpmeso steffen at sdaoden.eu
Sat May 26 23:43:35 CEST 2018


Hello Matthias.

Matthias Andree <matthias.andree at gmx.de> wrote:
 |Am 23.05.2018 um 00:07 schrieb Steffen Nurpmeso:
 |> I have been using bogofilter for some years now (thanks!), on
 |> FreeBSD, Arch Linux and now AlpineLinux.  The latter links against
 |> sqlite, and this is a really painful experience: very large
 |> database, slower by factors, see [1] for my request to change DB
 |> backend.
 |>
 |> The project leader of AlpineLinux notes correctly that Oracle
 |> changed the DB license, however, so i had to compile my own
 |> bogofilter using DB -- DB is used by Postfix on AlpineLinux, and
 |> smaller and faster is always better, if possible.
 |>
 |> Long story short: Postfix alternatively supports lmdb [2], and
 |> i wonder whether bogofilter support for this could be added, too?
 |> Looking at the Postfix code it seems pretty easy, the properties
 |> of lmdb also sound very promising for the purpose of Postfix and
 |> bogofilter.  What do you think, i mean, i have seen you have added
 |> KyotoCabinet in the R-C-S, but the possibility to use the same DB
 |> everywhere is very desirable i think?
 |>
 |>   [1] https://bugs.alpinelinux.org/issues/8860
 |>   [2] https://github.com/LMDB/lmdb
 |
 |Steffen,
 |
 |I do not see the immediate need for LMDB support (Berkeley DB 5
 |remains available if you cannot use an AGPL'd DB 6), nor why it is
 |desirable that independent applications use the same database library.
 |That said, it might be possible to add LMDB support, from a first
 |cursory glance over lmdb.h.

Good to know!  I have nothing against there being many databases,
of many kinds, but i personally completely lack the time and also
the interest to track all these different sources of information.
Regarding DB: you need to register to be able to access the
download area for >5, which is ok per se, but then again: not (for
me).  No no.

And then lmdb is very small and compiles quickly, which i really
like.  And i am not alone here!  Being able to drive the entire mail
system with a single small support library is very desirable to
me indeed.

 |If you have figures to prove that SQLite is slower by factors (beyond
 |3), I'd like to see them; other than that we've had D. Richard Hipp look
 |over our implementation and it's optimized quite a bit already -
 |although using an SQL database for a key-value store is an oversized
 |approach, so some price has to be paid.

Really?  Well, it really is, and by a lot; even after manually
applying VACUUM the database is twice as large, but worse is that
actual spam checking is slower than with DB by a factor of two
_and_more_, even if bogofilter is compiled -O1 -g.  Maybe or even
surely that also has to do with fsync(2) or whatever, i do not
know, but that does not change a thing.  I am doing `spamrate'
(which actually is "bogofilter -TTu 2>/dev/null") and i would not
expect that to come into play.  Yet sqlite DB file times change(d)
even after such a command.  They do not for DB.  I do not know.

I personally have nothing at all against sqlite, quite the
opposite is true, i have used it in the past explicitly by my own
decision; it is just that for my spam wordlist the performance
through bogofilter is extremely annoying.

 |I would however require that LMDB support provides ACID compliance.
 |(Note that I am arguing in Berkeley DB terms here, without acquaintance
 |with LMDB, hence the should.)

I, too, know LMDB only from its documentation.  Going by that, it
should satisfy all your requirements.  (I just searched for
a common denominator in between what i use, and if that would
finally be LMDB i thought, that would be nice.)  It is being
developed by and for OpenLDAP, which is being used quite
enterprisish...

 |We have a database abstraction layer in place, bogofilter/src/datastore*
 |are the necessary files, and datastore.h is the central interface, which
 |needs to be implemented by some to-be-written set of datastore_lmdb_*.c
 |files (or a single if you choose).

That i have seen, yes.
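
For readers who have not looked at that code: the idea behind such
an abstraction layer can be sketched roughly as below.  This is
a toy illustration under loud assumptions, it is not bogofilter's
datastore.h; every name in it (ds_backend, mem_backend, bump_token)
is made up, and the in-memory backend merely stands in for DB,
sqlite or a possible LMDB one.

```c
/* Hypothetical sketch, NOT bogofilter's real datastore.h: all names
 * (ds_backend, mem_*, bump_token) are invented for illustration. */
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Operations a key-value backend would have to provide. */
struct ds_backend {
    const char *name;
    void *(*open)(const char *path);
    int   (*get)(void *handle, const char *key, unsigned *count);
    int   (*put)(void *handle, const char *key, unsigned count);
    void  (*close)(void *handle);
};

/* Toy in-memory backend: a singly linked list of token counts. */
struct mem_entry { char *key; unsigned count; struct mem_entry *next; };
struct mem_store { struct mem_entry *head; };

static void *mem_open(const char *path)
{
    (void)path; /* nothing is stored on disk in this toy backend */
    return calloc(1, sizeof(struct mem_store));
}

static struct mem_entry *mem_find(struct mem_store *s, const char *key)
{
    struct mem_entry *e;
    for (e = s->head; e != NULL; e = e->next)
        if (strcmp(e->key, key) == 0)
            return e;
    return NULL;
}

static int mem_get(void *handle, const char *key, unsigned *count)
{
    struct mem_entry *e = mem_find(handle, key);
    if (e == NULL)
        return -1; /* token not found */
    *count = e->count;
    return 0;
}

static int mem_put(void *handle, const char *key, unsigned count)
{
    struct mem_store *s = handle;
    struct mem_entry *e = mem_find(s, key);
    if (e == NULL) {
        size_t len = strlen(key) + 1;
        if ((e = malloc(sizeof *e)) == NULL)
            return -1;
        if ((e->key = malloc(len)) == NULL) {
            free(e);
            return -1;
        }
        memcpy(e->key, key, len);
        e->next = s->head;
        s->head = e;
    }
    e->count = count;
    return 0;
}

static void mem_close(void *handle)
{
    struct mem_store *s = handle;
    struct mem_entry *e, *next;
    for (e = s->head; e != NULL; e = next) {
        next = e->next;
        free(e->key);
        free(e);
    }
    free(s);
}

const struct ds_backend mem_backend = {
    "memory", mem_open, mem_get, mem_put, mem_close
};

/* Caller code sees only the interface, never the backend behind it. */
int bump_token(const struct ds_backend *be, void *handle, const char *token)
{
    unsigned count = 0;
    assert(be != NULL && handle != NULL);
    (void)be->get(handle, token, &count); /* a missing token counts as 0 */
    return be->put(handle, token, count + 1);
}
```

With a layout like this, a new backend would be just one more
table of function pointers, and callers such as bump_token() would
stay untouched.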

 |The database implementation needs some support logic in
 |bogofilter/configure.ac and bogofilter/src/Makefile.am

That surely is the very very hard part.  :)

 |I will happily accept a contribution, and help with integrating it
 |especially WRT autotools.

Which sounds as if it would be a major relief.  :)

 |Are you proposing to do the work or can propose someone who will?

I am under total stress, but of course i think i will have to do
it, then.  I was already thinking i should have contacted the list
with a completed patch.  Well.

 |I think it could build on a simplified version of
 |bogofilter/src/datastore_db_trans.c that implements fully ACID-compliant
 |Berkeley DB Transactional Database mode - with the log file, lock and
 |recovery stuff done away with, since that should not be necessary for
 |LMDB if I understand my (as in Fedora 28 Workstation's)
 |/usr/include/lmdb.h correctly.

Interestingly i only have a wordlist.db on AlpineLinux, whereas
before there were logs and multiple other things (which in fact
i have forgotten, and am too lazy to boot the old box)!?!

 |I have little spare time at my hands for now, so I can not currently
 |take on a new task.

Ok.  I cannot name a date, but i am very interested in LMDB
support for bogofilter, as i am also interested in LMDB usage in
Postfix, so i will definitely search and find time to look into
writing a patch for bogofilter!  I will report back, then!

Thanks, Matthias, and a nice weekend,
Ciao,

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


