Getting "nan" in my verbose output
Chris Wilkes
cwilkes-bf at ladro.com
Thu Apr 8 20:17:36 CEST 2004
Hi all,
I just upgraded to version 0.17.5 and am looking at one of my users'
"makespam" folders where they dump email to be classified as spam.
What's odd is that the spam values of the emails were all around 0.5
until I ran them through "-Ns", and then suddenly the spam count was 0!
Digging deeper, I found the problem: my -Ns reduced the total good count
to 0.
Looking at one particular message I found the word "Pharmacy" in it.
$ bogofilter -vvv -I bademail.txt | grep -i Pharmacy
Word          n   pgood  pbad      fw   U
"Pharmacy"    86  nan    0.002943  nan  -
$ bogoutil -w ./wordlist.db Pharmacy
              spam   good
Pharmacy        86      0
$ bogoutil -w ./wordlist.db .MSG_COUNT
              spam   good
.MSG_COUNT   29226      0
What gives? This word has only been seen in spams (86 to 0) yet it
doesn't contribute to the spam count ("-" for U).
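
For what it's worth, the numbers above are consistent with a plain 0/0
division. This is only a sketch, not bogofilter's actual code: it assumes
pbad and pgood are the word's count divided by the matching .MSG_COUNT, and
that the per-word score is built from pbad / (pbad + pgood). The helper
ieee_div is hypothetical and only mimics C double division, where 0/0 gives
NaN instead of an error.

```python
import math

def ieee_div(num, den):
    """Divide like C double arithmetic: 0/0 gives NaN, x/0 gives +/-inf."""
    if den == 0:
        return math.nan if num == 0 else math.copysign(math.inf, num)
    return num / den

# Counts copied from the bogoutil output above.
word_spam, word_good = 86, 0
msgs_spam, msgs_good = 29226, 0

pbad = ieee_div(word_spam, msgs_spam)    # 86 / 29226
pgood = ieee_div(word_good, msgs_good)   # 0 / 0
# Assumed shape of the per-word score; any NaN input makes the result NaN.
p = ieee_div(pbad, pbad + pgood)

print(f"pgood={pgood} pbad={pbad:.6f} p={p}")
# prints: pgood=nan pbad=0.002943 p=nan
```

Under those assumptions the pbad value matches the verbose output exactly,
and once pgood is NaN everything derived from it is NaN too, which would
explain why the word is left out of the score.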
Granted, this is an extreme case brought about by my running -N a couple
of times. I'll have to look into a way of protecting against this, since
my "makespam" script does exactly that.
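
One possible guard, as a rough sketch only: read the good message count from
the bogoutil output before unregistering anything, and skip the -N step if it
would drop the count below some floor. The function names and the min_good
threshold are made up for illustration; the only thing taken from above is
the layout of the "bogoutil -w ./wordlist.db .MSG_COUNT" output.

```python
def parse_msg_count(bogoutil_output):
    """Return (spam, good) from the text printed by
    `bogoutil -w ./wordlist.db .MSG_COUNT`."""
    for line in bogoutil_output.splitlines():
        fields = line.split()
        if len(fields) == 3 and fields[0] == ".MSG_COUNT":
            return int(fields[1]), int(fields[2])
    raise ValueError("no .MSG_COUNT line found")

def safe_to_unregister_good(bogoutil_output, n_messages, min_good=1):
    """True if removing n_messages from the good count still leaves
    at least min_good good messages (min_good is an arbitrary floor)."""
    _spam, good = parse_msg_count(bogoutil_output)
    return good - n_messages >= min_good

# The output shown above: the good count is already 0, so -N is not safe.
sample = "spam good\n.MSG_COUNT 29226 0\n"
print(safe_to_unregister_good(sample, 1))  # False
```

A wrapper script could feed the real bogoutil output into
safe_to_unregister_good and fall back to a plain -s when it returns False.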
Chris