Wanting a pre-db4 bogofilter
matthias.andree at gmx.de
Fri Feb 25 04:21:47 EST 2005
Jef Poskanzer <jef at acme.com> writes:
>>I can't see why you couldn't get by with table or row locks - do you really
>>need rollback (the reason for the log files)?
> I think this is a very important point. We don't need transactions
> for bogofilter.
Yes, we do. See below.
> We don't need A, C, or D; all we need is I, which
> is simply a matter of locking.
We *need* Consistency, Isolation and Durability, and we want Atomicity.
> We don't even need row locking, a full-file lock would be just fine.
No, we need fine-grained locks. Full-file locks have regularly drawn
complaints: many readers (scorers) lock out training and vice versa, so
a full-file lock is insufficient.
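Bogofilter's real locking goes through Berkeley DB's lock manager in C;
purely as an illustration of the principle, here is a minimal Python
sketch (all names hypothetical) contrasting one coarse lock with
per-record locks: two different tokens can be held concurrently, while
the same token conflicts.

```python
import threading

# Hypothetical sketch, NOT bogofilter's actual code: fine-grained
# (per-record) locks let a scorer touch one token while training
# updates another; a single full-file lock would serialize them.
record_locks = {}                  # token -> its own lock
registry_guard = threading.Lock()  # protects the lock table itself

def lock_for(token):
    """Return the per-record lock for a token, creating it on demand."""
    with registry_guard:
        return record_locks.setdefault(token, threading.Lock())

# Two different tokens map to two different locks: both acquirable at once.
a, b = lock_for("free"), lock_for("viagra")
both_held = a.acquire(blocking=False) and b.acquire(blocking=False)

# The same token maps to the same lock: a second acquire fails (conflict).
c = lock_for("free")
conflict = not c.acquire(blocking=False)   # 'free' is already held above
```

With a full-file lock, `both_held` would be impossible; fine-grained
locks only conflict when the same record is touched.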
> Matthias Andree <matthias.andree at gmx.de>:
>>The Atomicity trait is helpful as we need to change either all of the
>>.MSG_COUNT token and the individual tokens or none, for accuracy.
> So the worst case you could come up with in a non-atomic system is
> that after a system crash, some of the counts might be off by one?
> We can live with that. The database would still be usable, that's
> what matters.
False on all three assertions/conclusions:
1. after a crash, the database may be totally unusable:
   non-transactional Berkeley DB guarantees nothing after a crash,
   whether an application or a system crash. Note that SIGINT, SIGHUP
   or SIGPIPE also "crashes" bogofilter by this definition.
2. when training larger message counts, say, a whole folder, the counts
   may be off by far more than one; they may be off by more than 100%
   of what had been registered before.
3. We cannot live with trivial situations trashing the database.
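The atomicity point above (change the .MSG_COUNT token and the
individual tokens together, or not at all) can be illustrated with a
transaction. This is not bogofilter's Berkeley DB code; it is a
hypothetical Python/sqlite3 sketch of the same principle, with a
simulated crash mid-training that rolls back cleanly.

```python
import sqlite3

# Hypothetical sketch, NOT bogofilter's actual storage: a wordlist
# table where per-token counts and .MSG_COUNT must commit atomically.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE wordlist (token TEXT PRIMARY KEY, count INTEGER)")
con.execute("INSERT INTO wordlist VALUES ('.MSG_COUNT', 0)")
con.commit()

def train(tokens):
    """Register one message: bump every token and .MSG_COUNT in one txn."""
    with con:  # transaction: commits on success, rolls back on exception
        for t in tokens:
            con.execute(
                "INSERT INTO wordlist VALUES (?, 1) "
                "ON CONFLICT(token) DO UPDATE SET count = count + 1", (t,))
        con.execute(
            "UPDATE wordlist SET count = count + 1 "
            "WHERE token = '.MSG_COUNT'")

train(["viagra", "free"])

# Simulate a crash after a token bump but before .MSG_COUNT is updated:
# the whole partial training is rolled back, counts stay consistent.
try:
    with con:
        con.execute(
            "UPDATE wordlist SET count = count + 1 WHERE token = 'free'")
        raise RuntimeError("simulated crash mid-training")
except RuntimeError:
    pass

counts = dict(con.execute("SELECT token, count FROM wordlist"))
# .MSG_COUNT and every token still agree; the aborted run left no trace.
```

Without the transaction, the simulated crash would leave 'free' counted
once more than .MSG_COUNT reflects, which is exactly the off-by-more-
than-one skew described in point 2.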