Getting "nan" in my verbose output

Chris Wilkes cwilkes-bf at ladro.com
Thu Apr 8 21:51:26 CEST 2004


On Thu, Apr 08, 2004 at 03:30:55PM -0400, David Relson wrote:
> On Thu, 8 Apr 2004 11:17:36 -0700
> Chris Wilkes wrote:
> 
> > Hi all,
> > 
> >   I just upgraded to version 0.17.5 and am looking at one of my users'
> > "makespam" folders where they dump email to be classified as spam.
> >   What's odd is that the spam values of the emails were all around 0.5
> > until I ran them through a "-Ns" and then suddenly the spam count was
> > 0! Digging deeper I found the problem: my -Ns reduced the total good
> > count to 0.
> > 
> > Looking at one particular message I found the word "Pharmacy" in it.
> >   
> >   $ bogofilter -vvv -I bademail.txt | grep -i Pharmacy
> >    Word            n    pgood      pbad    fw  U
> >   "Pharmacy"      86      nan  0.002943   nan  -
> >   $ bogoutil -w ./wordlist.db Pharmacy
> >                   spam   good
> >      Pharmacy       86      0
> >   $ bogoutil -w ./wordlist.db .MSG_COUNT
> >                   spam   good
> >     .MSG_COUNT   29226      0
> > 
> > What gives?  This word has only been seen in spams (86 to 0) yet it
> > doesn't contribute to the spam count ("-" for U).
> 
> Did you do a whole bunch of "-Ns" to cause the zero???  Do you think the
> good count went to zero appropriately, or inappropriately?
> 
> As you've discovered the "nan" is a result of a zero message count.
> Bogofilter uses the message count as a divisor and the division by zero
> causes the problem.  As a minor speedup, the "if 0, use 1 for division"
> check was deleted a while ago.  I'll modify the code to fix this for the
> next release.
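
For illustration, the arithmetic behind the "nan" can be sketched as below.
This is only a rough model, not bogofilter's actual code: the helper names
(ieee_div, word_probs) and the guard parameter are hypothetical, and it
assumes pgood and pbad are simply the word counts divided by the
corresponding .MSG_COUNT values, which matches the 86/29226 = 0.002943
shown in the verbose output.

```python
import math

def ieee_div(a: float, b: float) -> float:
    """Divide the way C doubles do: x/0 gives inf or nan instead of raising."""
    if b == 0:
        if a == 0:
            return math.nan
        return math.copysign(math.inf, a)
    return a / b

def word_probs(spam_count, good_count, spam_msgs, good_msgs, guard=False):
    """Return (pgood, pbad) for one word (hypothetical helper).

    With guard=True a zero message count is replaced by 1 before dividing,
    mimicking the "if 0, use 1 for division" check mentioned above.
    """
    if guard:
        spam_msgs = spam_msgs or 1
        good_msgs = good_msgs or 1
    pgood = ieee_div(good_count, good_msgs)
    pbad = ieee_div(spam_count, spam_msgs)
    return pgood, pbad

# Counts from the bogoutil output: "Pharmacy" 86/0, .MSG_COUNT 29226/0.
pgood, pbad = word_probs(86, 0, 29226, 0)
print(pgood, round(pbad, 6))          # nan 0.002943
print(math.isnan(pgood + pbad))       # True: nan spreads to anything derived from it

pgood, pbad = word_probs(86, 0, 29226, 0, guard=True)
print(pgood, round(pbad, 6))          # 0.0 0.002943
```

With the guard in place the good-side probability is a plain 0.0, so later
arithmetic on it stays well defined.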

Yes, you are correct: I did a bunch of -s's in a row on all the email
that she dumped into "makespam".  After doing so I take out the ones
that didn't register as spam and then do a -Ns on them (probably more
than once, too).  Since she gets a lot more spam than ham, doing so could
be problematic, as a lot of people here dump correctly classified spam
into makespam.

Now I'll just stick to doing a -s.  Since I keep track of everyone's
"makegoods" I'm running those through everyone's wordlist with a -n to
ward off this problem.

Thanks for putting the check back in there.  Hey, it could have been worse:
my -N could have caused the good .MSG_COUNT to go negative :)

Chris



