corrupted db files?
David Relson
relson at osagesoftware.com
Tue Dec 31 17:53:11 CET 2002
Fletcher,
There's definitely something wrong. A token's count should be less than
the number of messages processed (MSG_COUNT) when using the Robinson or
Robinson-Fisher methods, or less than 4 times MSG_COUNT when using the
Graham method. Also, all calls to db_setvalue() check for negative values
and replace them with 0. You should never see huge values like the ones
you're reporting.
What version of bogofilter are you running? If it's not the latest stable
version, i.e. 0.9.1.2, you should upgrade to get the newest features and
the best code.
If you start with a fresh database can you reproduce the problem?
David
At 11:24 AM 12/31/02, Fletcher Mattox wrote:
>Occasionally the word count values in my db files get very large.
>Many of them are near 2^32, perhaps suggesting integer overflow
>of some type.
... [snip] ...
>There probably are 3369 messages in this db, so maybe this is just
>an artifact of the way Berkeley DB works? Should I be concerned?
I'm not sure what the upper limits are, but I personally have 10x that many
messages, and I'm sure BerkeleyDB can handle millions. The limit is likely
2^32-1 or something like that. I'm sure you're not bumping into a DB limit,
but there is still cause for concern. Big numbers like that are WRONG.
David