Getting "nan" in my verbose output
David Relson
relson at osagesoftware.com
Thu Apr 8 21:30:55 CEST 2004
On Thu, 8 Apr 2004 11:17:36 -0700
Chris Wilkes wrote:
> Hi all,
>
> I just upgraded to version 0.17.5 and am looking at one of my user's
> "makespam" folders where they dump email to be classified as spam.
> What's odd is that the spam values of the emails were all around 0.5
> until I ran them through a "-Ns" and then suddenly the spam count was
> 0! Digging deeper I found the problem: my -Ns reduced the total good
> count to 0.
>
> Looking at one particular message I found the word "Pharmacy" in it.
>
> $ bogofilter -vvv -I bademail.txt | grep -i Pharmacy
> Word        n      pgood  pbad      fw   U
> "Pharmacy"  86     nan    0.002943  nan  -
> $ bogoutil -w ./wordlist.db Pharmacy
>             spam   good
> Pharmacy    86     0
> $ bogoutil -w ./wordlist.db .MSG_COUNT
>             spam   good
> .MSG_COUNT  29226  0
>
> What gives? This word has only been seen in spams (86 to 0) yet it
> doesn't contribute to the spam count ("-" for U).
Chris,
Did you run "-Ns" many times to bring the count down to zero? Do you think
the good count went to zero appropriately, or inappropriately?
As you've discovered, the "nan" is the result of a zero message count.
Bogofilter uses the message count as a divisor. With a good count of 0
and a good message count of 0, pgood works out to 0/0, which is "nan".
As a minor speedup, the "if 0, use 1 for division" check was deleted a
while ago. I'll modify the code to fix this for the next release.
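To illustrate the "if 0, use 1 for division" check, here is a minimal sketch in C. It assumes the per-word probability is simply the word's count divided by the message count for that class. The function names are hypothetical and this is not bogofilter's actual source.

```c
/* Hypothetical helpers for illustration only; not taken from bogofilter. */

/* Unguarded version: divides by the message count as-is.
 * With word_count == 0 and msg_count == 0 this evaluates 0.0 / 0.0,
 * which is NaN under IEEE 754 floating point. */
double word_prob_unguarded(unsigned long word_count, unsigned long msg_count)
{
    return (double)word_count / (double)msg_count;
}

/* Guarded version: "if 0, use 1 for division".
 * A zero message count is replaced by 1, so the result is always finite. */
double word_prob_guarded(unsigned long word_count, unsigned long msg_count)
{
    double divisor = (msg_count == 0) ? 1.0 : (double)msg_count;
    return (double)word_count / divisor;
}
```

With the figures from the message above, word_prob_guarded(86, 29226) gives about 0.002943 (the pbad value shown), and word_prob_guarded(0, 0) gives 0 where the unguarded version gives NaN.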
David