0.93.1 and DB_CONFIG flags

Matthias Andree matthias.andree at gmx.de
Thu Nov 11 01:21:09 CET 2004


David Relson <relson at osagesoftware.com> writes:

> Can do.  I had already planned on #ifdefs using DB_LOG_AUTOREMOVE and
> DB_TXN_NOT_DURABLE to ease some of the problems with different versions
> of BerkeleyDB.

Warning, pitfall: DB_TXN_NOT_DURABLE is "available" (as a C macro) in
several versions of Berkeley DB, but it is _only_ usable as an
environment flag in version 4.2 and in no other version.

There is a difference between what you can feed DB_ENV->set_flags
(exposed in DB_CONFIG) and what you can feed DB->set_flags (encapsulated;
must be set by the application, with rare exceptions such as
DB_TXN_NOT_DURABLE in 4.2), although the flags sometimes share the same
namespace.
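
To make the pitfall concrete, here is a minimal sketch of a guard that
checks the Berkeley DB version as well as the macro. The names
NOT_DURABLE_IS_ENV_FLAG and not_durable_is_env_flag are hypothetical and
exist in neither Berkeley DB nor bogofilter; only DB_VERSION_MAJOR,
DB_VERSION_MINOR, DB_TXN_NOT_DURABLE and DB_ENV->set_flags come from
<db.h>. The sketch does not include <db.h>, so it compiles without
Berkeley DB installed.

```c
#include <assert.h>

/*
 * Hypothetical macro (not part of Berkeley DB or bogofilter): true only
 * for Berkeley DB 4.2, the one version in which DB_TXN_NOT_DURABLE is
 * accepted by DB_ENV->set_flags. It expands to a constant expression,
 * so it can be used in #if with DB_VERSION_MAJOR and DB_VERSION_MINOR.
 */
#define NOT_DURABLE_IS_ENV_FLAG(major, minor) ((major) == 4 && (minor) == 2)

/* Hypothetical run-time twin of the macro above. */
static int not_durable_is_env_flag(int major, int minor)
{
    assert(major >= 0 && minor >= 0); /* version numbers are never negative */
    return NOT_DURABLE_IS_ENV_FLAG(major, minor) ? 1 : 0;
}

/*
 * Intended use once <db.h> is included (shown as a comment, not compiled):
 *
 * #if defined(DB_TXN_NOT_DURABLE)
 * #if NOT_DURABLE_IS_ENV_FLAG(DB_VERSION_MAJOR, DB_VERSION_MINOR)
 *     ret = dbenv->set_flags(dbenv, DB_TXN_NOT_DURABLE, 1);
 * #endif
 * #endif
 */
```

Testing defined(DB_TXN_NOT_DURABLE) alone would also succeed on versions
where the flag is only valid for DB->set_flags, which is exactly the
trap described above.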

Run cvs update and read section 4.3 in doc/README.db for the
differences. I dug that stuff out of the documentation for the various
Berkeley DB versions.

> After adding the needed code for bogofilter, I realized
> that bogoutil doesn't understand long options, e.g.
> --db_lk_max_locks=123, and doesn't know anything about bogofilter's
> config file.  Guess it's time to cut/paste/revise bogoutil's option
> handling.

Never mind. Long options for bogoutil are 0.93.2 stuff; the db_lk_* knobs
are accessible via DB_CONFIG, albeit not separately documented in the
new section yet. I will do that in the next few minutes.
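
As a rough illustration, a DB_CONFIG file in the database directory
could carry the lock limits like this. The set_lk_max_* names are the
Berkeley DB DB_CONFIG spellings of the corresponding DB_ENV methods; that
they correspond one-to-one to bogofilter's db_lk_* options is my
assumption, and the numbers are placeholders, not recommendations.

```
# DB_CONFIG sketch; values are placeholders, not tuned recommendations
set_lk_max_locks   16384
set_lk_max_objects 16384
set_lk_max_lockers 1024
```

Each line is the method name without the DB_ENV-> prefix, followed by
whitespace and the value; lines starting with # are ignored.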

We have several bugfixes worthy of a release on their own already.

If you can fix bogoutil, fine, but if you're taking my vote, let's do
0.93.1 soon. I haven't announced 0.93.0 on freshmeat yet because I do
not want to attract testers who could then run into the foul "automatic
recovery not working" bug I introduced last Tuesday.

>> There are other, less intrusive, options to address the write speed
>> issue; the most prominent is "don't use the -u option", the BerkeleyDB
>> options are DB_TXN_NOSYNC, DB_TXN_WRITE_NOSYNC and perhaps
>> DB_LOG_DSYNC, DB_DIRECT_DB, DB_DIRECT_LOG for machines with fast I/O
>> and slower memory copies, older SPARCS for instance.
>
> My bet is that the write speed issue Greg reported is for "bogoutil -l".
> I _know_ he doesn't use '-u'.

bogofilter -u is a hog and not well understood from the scoring/learning
point of view. The ostensible argument of "learning from known spam"
or such isn't backed by figures, and I don't dare offer my gut feeling
on this topic.

It is, however, well understood that -u entails write mode even for
scoring because of the subsequent registration, and with it queueing
behind locks (at least for the page holding .MSG_COUNT), log writes, and
database growth, with unknown gains in scoring accuracy.

I'd currently discourage the use of the -u option, if only for
performance reasons.

-- 
Matthias Andree


