Bogofilter seems to not be working

daniel djoneill at gmx.net
Tue Mar 25 22:15:57 CET 2003


So wrote David Relson on Tuesday 25 March 2003 at 04:03:59PM -0500:
> Hello Daniel,
> 
> There are several things you can do to check.  First, run bogofilter from 
> the command line with differing numbers of "-v" (verbose) flags to learn 
> more:
> 
> bogofilter -v < message --- will print the "X-Bogosity" line for the message
> bogofilter -vv < message --- will print a histogram showing token counts 
> vs. spam scores.
> bogofilter -vvv < message --- will list _all_ the tokens of the message 
> and their spam scores.

Sorry to sound so dense, but I am using Mutt with standard mbox mail folders, so each message is just a string of text in one big file.  How can I run bogofilter on a particular message from the command line like this?  I have tried piping to a shell command from within Mutt via the ! command, but this does not pipe the message.
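
For reference, one possible way to pull a single message out of an mbox file so it can be fed to bogofilter on standard input is sketched below. It uses only Python's standard `mailbox` module. The function name `extract_message` and the 0-based message index are assumptions made for this illustration and are not part of bogofilter or Mutt.

```python
import mailbox


def extract_message(mbox_path, index):
    """Return message number `index` (0-based) from an mbox file as raw bytes.

    The "From " separator line of the mbox format is not included.
    """
    # create=False: fail instead of silently creating an empty mailbox
    box = mailbox.mbox(mbox_path, create=False)
    try:
        keys = box.keys()
        if not 0 <= index < len(keys):
            raise IndexError(
                f"mbox holds {len(keys)} messages, no message at index {index}"
            )
        return box.get_bytes(keys[index])
    finally:
        box.close()
```

A small script could write the returned bytes to `sys.stdout.buffer` and have its output piped into `bogofilter -v`, which would give the same effect as `bogofilter -v < message` for a message stored in its own file.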

> 
> For words that you think are highly spammish, use bogoutil to display their 
> spam/good counts and spam scores, i.e.:
> 
> bogoutil -w ~/.bogofilter word1 word2 word3
> 
> The value "0.415000" is the score of a token not in either the ham or spam 
> wordlists.  When it appears as the score of a message, it indicates that 
> bogofilter isn't finding the message's tokens in the wordlists.  Use 
> "bogofilter -qv" to see what parameters bogofilter is working with and use 
> "bogofilter -x d -v </dev/null" to print some debug info (which will 
> include the names of the wordlists).
> 

Here is a listing:
 bogoutil -p -w ~/.bogofilter mortgage investment
                       spam    good  Gra prob  Rob prob
mortgage                 12      15  0.651841  0.649409
investment                1       2  0.400000  0.527607

Both of these words should be strong indicators of spam, since I do not think I have any good e-mails that contain either word.
What concerns me is that the good MSG COUNT increments by one whenever mail is retrieved, even when the mail is spam.  If I then do a <esc>-d on it, the spam MSG COUNT increments by one, yet that message has already been registered in the good COUNT.  This seems self-defeating.  It seems to me that <esc>-d should not only increment the spam COUNT by one and add all words in the given message to the bad list, but also remove the message and all its words from the good COUNT.
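
To make the bookkeeping described above concrete, here is a toy model of the two lists and their message counts. It is only a sketch and not bogofilter's actual code or storage format. The class `ToyWordlists`, its method names, and the choice to count each distinct token once per message are all assumptions made for illustration.

```python
from collections import Counter


class ToyWordlists:
    """Toy spam/good token counts (hypothetical, not bogofilter's implementation)."""

    def __init__(self):
        self.counts = {"spam": Counter(), "good": Counter()}
        self.msg_count = {"spam": 0, "good": 0}

    def register(self, tokens, kind):
        # Assumption: each distinct token counts once per message.
        self.counts[kind].update(set(tokens))
        self.msg_count[kind] += 1

    def unregister(self, tokens, kind):
        # Undo an earlier register(); never let a count drop below zero.
        for tok in set(tokens):
            if self.counts[kind][tok] > 0:
                self.counts[kind][tok] -= 1
        if self.msg_count[kind] > 0:
            self.msg_count[kind] -= 1

    def reclassify_as_spam(self, tokens):
        # The behaviour proposed above: remove from good, then add to spam.
        self.unregister(tokens, "good")
        self.register(tokens, "spam")
```

In this model, a message that was first registered as good and then passed to `reclassify_as_spam` ends up counted only on the spam side, whereas calling `register(tokens, "spam")` alone would leave it counted on both sides.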

Also, I just removed the "tests=bogofilter" string from the procmail testing line and also the -u option but neither of these has had any effect.  Spamicity is still 0.000000.


-- 
Daniel O'Neill
415-865-0923
djoneill at gmx.net



