[bas at debian.org: bogofilter bug]

Matthias Andree matthias.andree at gmx.de
Mon Dec 15 01:43:27 CET 2003


Clint Adams <schizo at debian.org> writes:

> Bas tried to send this to bogofilter-dev, but, of course, the
> list's poster policy bounced his message, so I'm forwarding it for him.
>
> The problem is that when 3.2.9's stat is called with
> flags==DB_CACHED_COUNTS, the code where the pagesize is set is
> skipped over by a 'goto done;'.  I'm not sure if this is a
> documentation bug or a software bug, and I don't know
> how other 3.x versions behave.

This is a bug in datastore_db.c, introduced November 21 by Gyepi's
attempt to make the beast compile.

DB_CACHED_COUNTS behaves correctly; it differs from DB_FAST_STAT in that
it doesn't fill in the page size. Dropping DB_CACHED_COUNTS would be far
too expensive, since we'd read every page in the database just to find
out the page size. So we'll just assume a page size of 16k for
BerkeleyDB 3.2 and older. It's only a file size warning margin, after all.
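
A rough sketch of why guessing too large is the safe direction, assuming
the warning fires once the file comes within some number of pages of the
size limit. The helper name near_size_limit and the 16-page margin used
in the note below are hypothetical and are not taken from bogofilter's
check_fsize_limit():

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (NOT bogofilter's check_fsize_limit): returns true
 * when a file of `size` bytes lies within `pages` pages of `limit`. */
bool near_size_limit(uint64_t size, uint64_t limit,
                     uint64_t pagesize, uint64_t pages)
{
    uint64_t margin = pagesize * pages;

    /* A margin as large as the limit itself always warns. */
    if (limit <= margin)
        return true;

    return size >= limit - margin;
}
```

With a 1 MiB limit and a 16-page margin, a 900000-byte file triggers the
warning when the page size is guessed as 16384, but not when it is
guessed as 4096. Overestimating only makes the warning come earlier.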

I don't dare use sysconf(_SC_PAGESIZE) or something similar at this
time, as a) BerkeleyDB may have chosen a different page size, and b) some
random system will not implement POSIX properly and choke on sysconf.
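
For reference, a minimal sketch of what a guarded sysconf query with a
fixed fallback could look like. The names guess_pagesize and
FALLBACK_PAGESIZE are made up for illustration, and this is not what the
patch does. It also only addresses objection b): the value returned is
the operating system's page size, which need not match the page size
BerkeleyDB chose for the database.

```c
#include <unistd.h>

/* Hypothetical fallback, matching the 16k guess discussed above. */
#define FALLBACK_PAGESIZE 16384L

/* Hypothetical helper: ask the OS for its page size, and fall back to a
 * fixed guess if the query is unavailable or fails. */
long guess_pagesize(void)
{
#ifdef _SC_PAGESIZE
    long sz = sysconf(_SC_PAGESIZE);

    /* sysconf returns -1 on error; treat any non-positive value as failure. */
    if (sz > 0)
        return sz;
#endif
    return FALLBACK_PAGESIZE;
}
```

A system without a usable <unistd.h> would still fail to compile this,
which is part of the reason for sticking with a constant.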

This is the pragmatic fix; the patch (committed to CVS) is below. Please
try it and report back whether it fixes the problem. I don't have
BerkeleyDB 3.2 at hand to test.

Index: src/datastore_db.c
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/datastore_db.c,v
retrieving revision 1.57
retrieving revision 1.58
diff -u -r1.57 -r1.58
--- src/datastore_db.c	9 Dec 2003 02:19:59 -0000	1.57
+++ src/datastore_db.c	15 Dec 2003 00:37:40 -0000	1.58
@@ -1,4 +1,4 @@
-/* $Id: datastore_db.c,v 1.57 2003/12/09 02:19:59 relson Exp $ */
+/* $Id: datastore_db.c,v 1.58 2003/12/15 00:37:40 m-a Exp $ */
 
 /*****************************************************************************
 
@@ -285,10 +285,6 @@
 	    /* query page size */
 #if DB_AT_LEAST(3,3)
 	    ret = dbp->stat(dbp, &dbstat, DB_FAST_STAT);
-#else
-	    ret = dbp->stat(dbp, &dbstat, NULL, DB_CACHED_COUNTS);
-#endif
-      
 	    if (ret) {
 		dbp->err (dbp, ret, "%s (db) stat: %s", progname, handle->name);
 		db_close(handle, false);
@@ -296,6 +292,15 @@
 	    }
 	    pagesize = dbstat->bt_pagesize;
 	    free(dbstat);
+#else
+	    /* The old, pre-3.3 API will not fill in the page size with
+	     * DB_CACHED_COUNTS, and without DB_CACHED_COUNTS,
+	     * BerlekeyDB will read the whole data base, incurring a
+	     * severe performance penalty. We'll guess a page size.
+	     * As this is a safety margin for the file size, we'll
+	     * rather choose it too large than too small. */
+	    pagesize = 16384;
+#endif
 
 	    check_fsize_limit(handle->fd[i], pagesize);
 
-- 
Matthias Andree

Encrypt your mail: my GnuPG key ID is 0x052E7D95



