0.93.1 and DB_CONFIG flags
Matthias Andree
matthias.andree at gmx.de
Thu Nov 11 01:21:09 CET 2004
David Relson <relson at osagesoftware.com> writes:
> Can do. I had already planned on #ifdefs using DB_LOG_AUTOREMOVE and
> DB_TXN_NOT_DURABLE to ease some of the problems with different versions
> of BerkeleyDB.
Warning, pitfall: DB_TXN_NOT_DURABLE is "available" (as a C macro) in
several versions of Berkeley DB, but it is _only_ usable as an environment
flag in version 4.2, not in any other version of Berkeley DB.
There is a difference between what you can feed DB_ENV->set_flags
(exposed in DB_CONFIG) and what you can feed DB->set_flags (encapsulated,
must be set by the application, with rare exceptions such as
DB_TXN_NOT_DURABLE in 4.2), even though the two sets of flags sometimes
share the same namespace.
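
To illustrate why a plain "#ifdef DB_TXN_NOT_DURABLE" is not enough, here
is a minimal sketch of a version check. The helper name
txn_not_durable_is_env_flag is hypothetical (it is not taken from the
bogofilter sources), and it encodes only the rule stated above: the flag
is usable as an environment flag in Berkeley DB 4.2 and nowhere else.

```c
#include <stdbool.h>

/*
 * Hypothetical helper, not part of bogofilter or Berkeley DB.
 * Returns true only when DB_TXN_NOT_DURABLE may be passed to
 * DB_ENV->set_flags, which per the note above is Berkeley DB 4.2 only.
 * The mere presence of the macro does not tell you this.
 */
static bool txn_not_durable_is_env_flag(int major, int minor)
{
    return major == 4 && minor == 2;
}

/*
 * Usage sketch (needs <db.h>, so it is shown as a comment only):
 *
 *   #ifdef DB_TXN_NOT_DURABLE
 *   if (txn_not_durable_is_env_flag(DB_VERSION_MAJOR, DB_VERSION_MINOR))
 *       ret = dbe->set_flags(dbe, DB_TXN_NOT_DURABLE, 1);
 *   #endif
 */
```

In real code the arguments would be the DB_VERSION_MAJOR and
DB_VERSION_MINOR macros from <db.h>, so the check can be folded at
compile time.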
Run cvs update and see section 4.3 in doc/README.db to see the
differences. I dug that stuff out of the documentation for the various
Berkeley DB versions.
> After adding the needed code for bogofilter, I realized
> that bogoutil doesn't understand long options, e.g.
> --db_lk_max_locks=123, and doesn't know anything about bogofilter's
> config file. Guess it's time to cut/paste/revise bogoutil's option
> handling.
Never mind. Long options for bogoutil are 0.93.2 stuff; the db_lk_* knobs
are accessible via DB_CONFIG, albeit not yet separately documented in the
new section. Will do that in the next few minutes.
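
For illustration, the lock knobs could be set in a DB_CONFIG file roughly
like this. The directive names follow Berkeley DB's DB_CONFIG syntax
(they mirror the DB_ENV->set_lk_max_* methods); the values are
placeholders, not recommendations, and the file is assumed to live in the
directory that holds the bogofilter database.

```
# DB_CONFIG (illustrative values only)
set_lk_max_locks   16384
set_lk_max_objects 16384
set_lk_max_lockers 1024
```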
We have several bugfixes worthy of a release on their own already.
If you can fix bogoutil, fine, but if you're taking my vote, let's do
0.93.1 soon. I haven't announced 0.93.0 on freshmeat yet because I do not
want to attract testers who could then run into the foul "automatic
recovery not working" bug I introduced last Tuesday.
>> There are other, less intrusive, options to address the write speed
>> issue; the most prominent is "don't use the -u option", the BerkeleyDB
>> options are DB_TXN_NOSYNC, DB_TXN_WRITE_NOSYNC and perhaps
>> DB_LOG_DSYNC, DB_DIRECT_DB, DB_DIRECT_LOG for machines with fast I/O
>> and slower memory copies, older SPARCS for instance.
>
> My bet is that the write speed issue Greg reported is for "bogoutil -l".
> I _know_ he doesn't use '-u'.
bogofilter -u is a hog and not well understood from the scoring/learning
point of view. The ostensible argument of "learning from known spam" or
the like isn't backed by figures, and I don't dare offer my gut feeling
on this topic.
It is, however, well understood that -u entails write mode even for
scoring because of the subsequent registration, and with it queueing
behind locks (at least for the page holding .MSG_COUNT), log writes, and
database growth, all with unknown gains in scoring accuracy.
I'd currently discourage use of the -u option, if only for
performance reasons.
--
Matthias Andree