possible message count corruption?

Jeremy Blosser jblosser-bogofilter at firinn.org
Thu Sep 19 14:51:34 CEST 2002


As mentioned previously, we're running some basic stress tests using
bogofilter to classify our entire incoming mail load as spam, and something
interesting happened.  Note that we're using 0.7.3, so it's possible this
is known and fixed; we're looking seriously at deploying this in production
within weeks, so we grabbed the latest version that was not explicitly
tagged 'beta' on sf.net.

Anyway.  For each message we're running:

% bogofilter -v >> bogofilter.log; bogofilter -s -v >> bogofilter.log

This ran fine for most of yesterday, but then last night I was glancing at
the log and saw this:

% grep '^bogofilter:' bogofilter.log
bogofilter: 4172 messages on the spam list
...
bogofilter: 5294 messages on the spam list
bogofilter: 5295 messages on the spam list
bogofilter: 5296 messages on the spam list
bogofilter: 5297 messages on the spam list
bogofilter: 5298 messages on the spam list
bogofilter: 1 messages on the spam list
bogofilter: 2 messages on the spam list
bogofilter: 3 messages on the spam list
bogofilter: 4 messages on the spam list
bogofilter: 5 messages on the spam list
...
bogofilter: 13054 messages on the spam list

The 'increment' lines around the point the message count reset don't
indicate that the word counts themselves reset.  I don't believe the count
reset before this, but I don't have a log going back to the beginning (as
fast as it grows, I've been nuking the log every few hours; I was more
interested in tracking the system load and db sizes and was mostly logging
to artificially inflate the load).
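
For anyone wanting to check their own logs for the same symptom, a rough
sketch along these lines should flag every point where the reported count
goes backwards.  It assumes the log lines look exactly like the
"bogofilter: N messages on the spam list" lines quoted above; the function
name find_count_resets is made up for illustration and is not part of
bogofilter.

```shell
# Hypothetical helper, not part of bogofilter: report every point where
# the logged spam-list message count is lower than on the previous
# matching line.  Assumes lines of the form
#   bogofilter: 5298 messages on the spam list
find_count_resets() {
    # $1 is the path to the log file.
    grep '^bogofilter:' "$1" |
        awk '$3 == "messages" {
                 count = $2 + 0            # force numeric comparison
                 if (seen && count < prev)
                     printf "entry %d: %d -> %d\n", NR, prev, count
                 prev = count
                 seen = 1
             }'
}

# Usage (the entry number counts only the lines grep matched):
#   find_count_resets bogofilter.log
```

This only detects a drop in the message count; it says nothing about
whether the per-word counts were affected.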

Anyway, if this is a new or interesting report let me know and I can
provide more information.  The entire log from that period is 120MB
unzipped, 12MB bzipped.

For summary digest subscription: bogofilter-digest-subscribe at aotto.com