DB backend support for lmdb?

Steffen Nurpmeso steffen at sdaoden.eu
Fri Jan 25 23:53:07 CET 2019


Hallo Matthias!

Matthias Andree wrote in <a4aecb7f-fa93-f918-dec4-a9a47926f1d9 at gmx.de>:
 |Am 17.01.19 um 19:06 schrieb Steffen Nurpmeso:
 |> Matthias Andree wrote in <c4f77995-7556-9fe0-e7e7-e6470fa6f715 at gmx.de>:
 ...
 |> Due to some thread on a NetBSD ML i had a glance on the LMDB code
 |> and saw that one overflow check condition was not right.  It
 |> cannot happen in practice because LMDB key length restriction is
 |> far below 31-bit (511 bytes by default), but still one overflow
 |> could happen and would not be detected because of wraparound.
 |> This is fixed by the attached patch.  (Tests all fine but the
 |> skipped ones.)
 |>
 |> Do you have any plans on doing a new bogofilter release?  It is of
 |> course nothing but purely selfish ... but KyotoCabinet is also
 |> a new backend, and is lingering for some time, too.  If i diff
 |> from my imported master (v1.2.4), there are quite some changes.
 |> (I definetely have seen releases with fewer changes.)
 |>
 |> No git yet it seems, but v1.2.4 seems to be more than five years
 |> old, hm hm.  v1.2.5 would be cool, anyway!
 ...
 |indeed no Git yet, real-life work went crazy last autumn as in really
 |crazy, so I pushed everything back quite a bit.

Yes, of course.  You have been talking about it, that is why.
I normally do not have all the auto* series installed, so my
personal interest tends to look in direction pre-prepared ball...

 |Thanks for proactively maintaining and catching up your LMDB support so
 |we can refine it!

I was only fixing a bad overflow check, which should have been
avoided from the start.  If it would be that easy.

 |However, can I ask you to resend the patch without the many
 |reformattings? There's a lot of
 |
 |foo *     ->    foo*
 |
 |reformatting going on that obscures the actual change.

Mist!  You recognized it!!!  I have now compiled with C++ compiler
by myself, to proactively avoid that you have to add more of that
terrible "foo *" syntax instead of my beautiful "foo*" notation.
Fixed in the below.

 |And yes, a new release is planned, but not all changes that I feel are
 |needed are in yet.

Ok.  But the silence on this list does not mean anything but that
people in Germany, and likely in Taiwan and such, are nonetheless
waiting and hoping for a new release.  ^_^

 |Thanks again!

Thank you Matthias.  I say Ciao! already here, i had forgotten
MIME attachments get scrubbed...

diff --git a/bogofilter/src/datastore_lmdb.c b/bogofilter/src/datastore_lmdb.c
index b8c4015..681db24 100644
--- a/bogofilter/src/datastore_lmdb.c
+++ b/bogofilter/src/datastore_lmdb.c
@@ -5,7 +5,7 @@
  * datastore_lmdb.c -- implements the datastore, using LMDB.
  *
  * AUTHORS:
- * Steffen Nurpmeso <steffen at sdaoden.eu>    2018
+ * Steffen Nurpmeso <steffen at sdaoden.eu>    2018, 2019
  * (copied from datastore_kc.c:
  * Gyepi Sam <gyepi at praxis-sw.com>          2003
  * Matthias Andree <matthias.andree at gmx.de> 2003, 2018
@@ -24,11 +24,11 @@
  *    reaches the size limit, the transaction must be aborted, then the
  *    environment must be resized, then a new transaction has to be created.
  *    Resizing will not shrink, effectively.
- * 3. We assume xmalloc() aborts if out of memory.
- * 4. We assume no token->leng actually exceeds int32_t.
- * 5. mdb_env_get_maxkeysize():
+ * 3. mdb_env_get_maxkeysize():
  *      Depends on the compile-time constant #MDB_MAXKEYSIZE. Default 511.
  *    We reject any keys which excess this.
+ * 4. We assume xmalloc() aborts if out of memory.
+ * 5. We assume no token->leng actually exceeds int32_t.
  *
  * In order to be able to deal with 2. we need to track all changes that are
  * performed in a txn, so that in case we are running against the wall we are
@@ -114,8 +114,8 @@ struct a_bflm{
 struct a_bflm_txn_cache{
     struct a_bflm_txn_cache *bflmtc_last;   /* Up-to-date (stack usage) */
     struct a_bflm_txn_cache *bflmtc_next;   /* Needs to be build before use! */
-    char *bflmtc_caster;    /* Current caster */
-    char *bflmtc_max;       /* Maximum usable byte, exclusive */
+    char *bflmtc_caster;                    /* Current caster */
+    char *bflmtc_max;                       /* ..imum usable byte, exclusive */
     /* Actually points to &self[1] TODO [0] or [8], dep. __STDC_VERSION__! */
     char *bflmtc_data;
 };
@@ -615,19 +615,25 @@ a_bflm_txn_cache_put(struct a_bflm *bflmp, MDB_val *key, MDB_val *val_or_null){
     char const *emsg;
     size_t kl, vl, i;
 
-    kl = key->mv_size;
+    if((kl = key->mv_size) >= 0x7FFFFFFFu)
+        goto jeoverflow;
     if(val_or_null != NULL){
-        vl = val_or_null->mv_size;
-        i = (2 * sizeof(uint32_t)) + kl + vl;
+        if((vl = val_or_null->mv_size) >= 0x7FFFFFFFu)
+            goto jeoverflow;
+        if((i = kl + vl) >= 0x7FFFFFFFu - 2 * sizeof(uint32_t))
+            goto jeoverflow;
+        i += 2 * sizeof(uint32_t);
     }else{
         vl = 0;
-        i = sizeof(uint32_t) + kl;
+        if((i = kl) >= 0x7FFFFFFFu - sizeof(uint32_t))
+            goto jeoverflow;
+        i += sizeof(uint32_t);
     }
     i = a_BFLM_TXN_CACHE_ALIGN(i);
 
     /* XXX We actually should abort() the program instead: cannot be handled */
-    if(kl >= 0x7FFFFFFFu || vl >= 0x7FFFFFFFu ||
-            i >= 0x7FFFFFFFu - sizeof(*bflmtcp)){
+    if(i >= 0x7FFFFFFFu - sizeof(*bflmtcp)){
+jeoverflow:
         emsg = "LMDB: entry too large to be stored";
         goto jleave;
     }

--steffen
|
|Der Kragenbaer,                The moon bear,
|der holt sich munter           he cheerfully and one by one
|einen nach dem anderen runter  wa.ks himself off
|(By Robert Gernhardt)


More information about the bogofilter-dev mailing list