corrupted db files?

David Relson relson at osagesoftware.com
Tue Dec 31 19:19:02 CET 2002


At 01:01 PM 12/31/02, Fletcher Mattox wrote:

> > Fletcher,
> >
> > There's definitely something wrong.  A token's count should be less than
> > the number of messages (MSG_COUNT) processed (if using the Robinson or
> > Robinson-Fisher methods) or less than 4 times MSG_COUNT (if using Graham
> > method).
>
>I am using the default method (no config file), which I think is Robinson.

Correct.  The default is Robinson.  I've been using '-f', aka 
Robinson-Fisher, for the last three weeks and think its ability to 
classify messages as spam, ham, or unsure is terrific.  It has been totally 
accurate in its ham and spam classifications.  Most of the messages 
classified as unsure are actually spam, but there aren't many of them.  I 
can't imagine going back to the older binary classification.

> > Also, all calls to db_setvalue() check for negative values and
> > replace them by 0.  You should never see huge values like you're reporting.
>
>Interestingly, bogofilter still works.  Or at least is not badly broken.
>I had run a week like this before I even noticed it.

Since bogofilter looks at _all_ the tokens of a message, having a few 
tokens with b0rken values doesn't matter.
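
For anyone wanting to check a word list for such values, the bounds quoted 
above can be sketched roughly as follows.  This is only an illustration, not 
bogofilter's code: the function names and the plain dictionary of token 
counts are made up for the example, and a count equal to the limit is 
assumed to be acceptable.

```python
def clamp_count(value):
    """Replace a negative count by 0 (the behaviour described for db_setvalue())."""
    return max(0, value)


def suspicious_tokens(counts, msg_count, graham=False):
    """Return the tokens whose counts fall outside the expected range.

    counts    -- hypothetical dict mapping token -> count
    msg_count -- number of messages processed (MSG_COUNT)
    graham    -- True for the Graham method, which allows up to 4 * MSG_COUNT
    """
    limit = 4 * msg_count if graham else msg_count
    # Assumption: a count exactly equal to the limit is treated as valid.
    return {tok: n for tok, n in counts.items() if n < 0 or n > limit}
```

Calling suspicious_tokens() on the counts dumped from a word list would 
return only the entries that need a closer look.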

> > What version of bogofilter are you running?  If not the latest stable
> > version, i.e. 0.9.1.2, you should upgrade for all the newest features and
> > the best code.
>
>0.9.1.2,

That's the right answer.

> > If you start with a fresh database can you reproduce the problem?
>
>It has happened twice now.  Both times took about a week to manifest.
>
>I wonder if it's a file locking problem.  I'm running on Solaris 7,
>where read locking always fails:
>
>         cs.utexas.edu$ ./bogofilter -v -d . </etc/motd
>         [8096] [0] Faked blocking read lock on ./goodlist.db
>         [8096] [1] Faked nonblocking read lock on ./spamlist.db
>         X-Bogosity: No, tests=bogofilter, spamicity=0.000415, version=0.9.1.2
>         cs.utexas.edu$
>
>I don't know why this happens (or if it's serious), but it appears
>that fcntl() is returning EAGAIN.  It does not happen under linux.
>However, I don't see how failure of read locking could corrupt the db.
>I have never noticed lock failure while writing to the database, but
>I could easily have missed that.

Fletcher,

I'll let Gyepi and Matthias handle this one.  They know more about 
BerkeleyDB and Solaris than do I.

I've had some problems on Linux with database contention, resulting in 
"failed to acquire lock" messages.  datastore_db.c has a delay at the end 
of its lock acquisition loop, but the delay time is short (although the 
comment indicates otherwise).  The delay has been fixed since 0.9.1.2 and 
my locking failures are now followed by an X-Bogosity status message (after 
a few seconds).
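
The general shape of such a retry loop, with a delay between attempts, is 
roughly the following.  This is a sketch of the technique only, not the code 
in datastore_db.c: the function name, the attempt count, and the delay are 
assumptions chosen for the example.

```python
import errno
import fcntl
import time


def acquire_lock(fileobj, exclusive=False, attempts=5, delay=0.5):
    """Try a non-blocking fcntl lock, sleeping between failed attempts.

    Returns True once the lock is held, False if every attempt failed.
    """
    mode = fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH
    for attempt in range(attempts):
        try:
            fcntl.lockf(fileobj, mode | fcntl.LOCK_NB)
            return True
        except OSError as exc:
            # EAGAIN or EACCES means another process holds a conflicting lock.
            if exc.errno not in (errno.EAGAIN, errno.EACCES):
                raise
            if attempt < attempts - 1:
                time.sleep(delay)  # wait before the next attempt
    return False
```

With a delay that is too short, all attempts can be used up before the other 
process has released its lock, which matches the failures described above.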

David
