Wanting a pre-db4 bogofilter

Matthias Andree matthias.andree at gmx.de
Sat Feb 26 02:32:32 CET 2005


Tom Allison <tallison at tacocat.net> writes:

> David Relson wrote:
>> H'lo Karl,
>> As a point of information, bogofilter has been using BerkeleyDB 4.x for
>> quite a while.  What changed with 0.93.0 is that bogofilter started
>> using BerkeleyDB's transaction capability.
>>
> [snip]
>> In a few days I hope to be releasing 0.94.0 which will have run-time
>> switches to tell bogofilter (and bogoutil) to use transaction (or
>> non-transaction) mode.  Based on the survey of a few weeks back,
>> bogofilter's users prefer the simpler non-transactional mode to the
>> transactional mode and are willing to accept greater danger of database
>> corruption in exchange for simpler database administration.
>
> Sounds to me like we all need to read, "Who Moved my Cheese?"
>
> Let me see if I get this right as a "where are we now and where's that 
> cheese gone to anyways?" ::
>
> Bogofilter has been using version 4 for a while now.

Bugger it! The next person to talk about Berkeley DB 4 (or adumbrations
thereof) without my prior authorization will be unsubscribed from the
mailing list. :-P

It's Transactional Data Store (call it TXN or XA, I don't mind) versus
plain stock vanilla traditional old-fashioned crash-prone Data
Store. Once and for all.

> It's only been difficult since it incorporated the transactional 
> capabilities.

Yes indeed, the log files have caused maintenance pains, which the next
bogofilter release resolves in its default setting.

> Does this mimic really big fat huge databases like the ever flame-war 
> capable MySQL vs PostgreSQL?

Not in the least. MySQL has long been a wimpy webmaster toy for people
who needed something done quickly and then ran off, never mind
completeness of implementation or guarantees.

Seriously, all it does is make sure bogofilter has made a decent, honest
attempt at diligently handling user data, and avoiding surprises and
corruption such as half training and such.

> You betch ya!  But it's that little
> scrawny database in the corner that doesn't do the networking, ACL, 
> SSL-tunneling, threading and all that other cool database stuff.

Inter-process communication, SSL certificate handling, and threading are
the beasts that impair portability, robustness, and speed.

> So why has bogofilter been such a total pain in the butt?
>
> Because it's open source and we're spoiled?

Perhaps because there has been an honest attempt to be more careful with
user data. Turns out that "catastrophic recovery", which is the only
reason to keep all logs, is hardly ever asked for. "Regular recovery"
after a crash, when your hardware was undamaged, only requires the
active logs.
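
To make the distinction concrete, here is a minimal sketch in Python. It
is a toy model, not bogofilter or Berkeley DB code: split_logs() and its
oldest_active argument are hypothetical names made up for illustration,
and the only thing assumed about Berkeley DB is its log.NNNNNNNNNN file
naming. In real use you would ask Berkeley DB itself (for example its
db_archive utility) which log files are no longer in use rather than
guess a cutoff.

```python
import re

# Berkeley DB names its log files log.NNNNNNNNNN (ten digits).
LOG_NAME = re.compile(r"^log\.(\d{10})$")


def split_logs(filenames, oldest_active):
    """Partition log file names into (inactive, active).

    oldest_active is a hypothetical input: the number of the oldest log
    file still needed for regular recovery. Files that do not look like
    log files (database files, region files) are ignored.
    """
    inactive, active = [], []
    for name in sorted(filenames):
        match = LOG_NAME.match(name)
        if match is None:
            continue  # not a log file, leave it alone
        if int(match.group(1)) >= oldest_active:
            active.append(name)    # required for regular recovery
        else:
            inactive.append(name)  # only useful for catastrophic recovery
    return inactive, active


if __name__ == "__main__":
    files = ["log.0000000003", "log.0000000001", "__db.001",
             "log.0000000002", "wordlist.db"]
    inactive, active = split_logs(files, oldest_active=3)
    print("inactive:", inactive)
    print("active:  ", active)
```

Only the "active" list matters after an ordinary crash. The "inactive"
list is what you keep around solely for the option of catastrophic
recovery.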

> I was a personal source of many dumb questions right after the 
> transactional upgrade.

Fine. The only "dumb" question is the question asked without checking
the docs first.

If the docs left you like a dying duck in a thunderstorm, fine, it's our
task to fix the docs, and they ARE lacking. That's where users who want
to help but cannot hack C can help: check whether documentation, --help
output, and actual behavior are consistent, and report bugs.

WRT transactional code, log file handling was the biggest issue. Not
telling users about update procedures, partially because we ourselves
had too little experience with Berkeley DB Transactional mode in corner
cases, has been a further problem. Docs say one thing, real use another,
and resizing the environment is yet another.

> I got it because I decided it might be interesting to play with.
> Problem is, bogofilter upgrades have been so utterly routine for so
> many years I just wasn't prepared to read the man pages that night.

Oh, two is many? :-) Users not reading RELEASE.NOTES or NEWS is nothing
we can do anything about, except not release new versions...

-- 
Matthias Andree
_______________________________________________
Bogofilter mailing list
Bogofilter at bogofilter.org
http://www.bogofilter.org/mailman/listinfo/bogofilter


