Wanting a pre-db4 bogofilter

Matthias Andree matthias.andree at gmx.de
Fri Feb 25 10:21:47 CET 2005


Jef Poskanzer <jef at acme.com> writes:

>>I can't see why you couldn't get by with table or row locks - do you really
>>need rollback (the reason for the log files)?
>
> I think this is a very important point.  We don't need transactions
> for bogofilter.

Yes, we do. See below.

> We don't need A, C, or D; all we need is I, which
> is simply a matter of locking.

We *need* Consistency, Isolation and Durability, and we want Atomicity
as well.

> We don't even need row locking, a full-file lock would be just fine.

No, we need fine-grained locks. Full-file locks have regularly caused
complaints about many readers (scorers) locking out training and vice
versa; a full-file lock is insufficient.

> Matthias Andree <matthias.andree at gmx.de>:
>>The Atomicity trait is helpful as we need to change either all of the
>>.MSG_COUNT token and the individual tokens or none, for accuracy.
>
> So the worst case you could come up with in a non-atomic system is
> that after a system crash, some of the counts might be off by one?
> We can live with that.  The database would still be usable, that's
> what matters.

False on all three assertions/conclusions:

1. After a crash, the database may be totally unusable:
   non-transactional Berkeley DB does not guarantee anything after a
   crash, be it an application or a system crash. Note that SIGINT,
   SIGHUP or SIGPIPE also "crash" bogofilter by this definition.

2. When training larger message counts, say, a whole folder, the counts
   may be off by far more than one; they may be off by more than 100%
   of what had been registered before.

3. We cannot live with trivial situations trashing the database.
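
To make point 2 concrete, here is a minimal sketch of an interrupted
folder registration. It is not bogofilter code: it uses Python's sqlite3
module as a stand-in for Berkeley DB, and the table layout, the helper
names (open_wordlist, register_folder, ...) and the Interrupted
exception are assumptions made up for illustration. It also assumes the
per-token counts for the whole folder are accumulated first and written
out afterwards. The sketch can only show the logical inconsistency; it
cannot reproduce point 1, since SQLite keeps its own file intact.

```python
import sqlite3
from collections import Counter


class Interrupted(Exception):
    """Stands in for a crash or signal arriving in the middle of an update."""


def open_wordlist():
    # Hypothetical schema: one row per token, plus the .MSG_COUNT row.
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE wordlist (token TEXT PRIMARY KEY, count INTEGER NOT NULL)"
    )
    conn.execute("INSERT INTO wordlist (token, count) VALUES ('.MSG_COUNT', 0)")
    conn.commit()
    return conn


def add(conn, token, n):
    # Increment an existing token, or create it if it is not there yet.
    cur = conn.execute(
        "UPDATE wordlist SET count = count + ? WHERE token = ?", (n, token)
    )
    if cur.rowcount == 0:
        conn.execute("INSERT INTO wordlist (token, count) VALUES (?, ?)", (token, n))


def apply_writes(conn, writes, fail_after, commit_each):
    for i, (token, n) in enumerate(writes):
        if fail_after is not None and i == fail_after:
            raise Interrupted()  # simulated interruption before write number i
        add(conn, token, n)
        if commit_each:
            conn.commit()  # non-atomic: every write becomes permanent at once


def register_folder(conn, messages, atomic, fail_after=None):
    # Accumulate counts for the whole folder, then write tokens and .MSG_COUNT.
    updates = Counter()
    for tokens in messages:
        updates.update(set(tokens))
    writes = sorted(updates.items()) + [(".MSG_COUNT", len(messages))]
    try:
        if atomic:
            with conn:  # commits on success, rolls back on any exception
                apply_writes(conn, writes, fail_after, commit_each=False)
        else:
            apply_writes(conn, writes, fail_after, commit_each=True)
    except Interrupted:
        pass  # the "crash" happened; whatever is in the table now is what survives


def counts(conn):
    return dict(conn.execute("SELECT token, count FROM wordlist"))
```

With a three-message folder and an interruption after two of the four
writes, the non-atomic variant leaves token counts of 3 and 1 in place
while .MSG_COUNT is still 0, so the surviving tokens are off by their
full increment, not by one. The atomic variant leaves the table exactly
as it was before the registration started.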

-- 
Matthias Andree
_______________________________________________
Bogofilter mailing list
Bogofilter at bogofilter.org
http://www.bogofilter.org/mailman/listinfo/bogofilter


