All mails have spamicity=0.5200000

David Relson relson at osagesoftware.com
Tue Dec 13 01:16:29 CET 2005


On Mon, 12 Dec 2005 14:57:22 +0000
Robin Bowes wrote:

> Hi,
> 
> I've recently (6th Dec) moved to bogofilter 1.0.0 and ended up starting
> again with an empty wordlist.
> 
> I'm running bogofilter from maildrop like this:
> 
> BOGOFILTER="/usr/bin/bogofilter"
> BOGOARGS="-e -p -u -d"
> BOGODIR=/path/to/home/dir/.bogofilter
> ...
> xfilter "${BOGOFILTER} ${BOGOARGS} ${BOGODIR}"
> 
> Since then, I've accepted around 3000 msgs. I'm manually training by
> dropping spam into a Spam/Undetected folder and processing this from a
> cron job using the following command:
> 
>  $BOGOFILTER -Ns -d $BOGODIR < $message
> 
> However, all msgs are still only getting a spamicity rating of 0.520000.
> 
> bogoutil -H wordlist.sb shows this:

...[snip]....

> It looks to me like something's not quite right.

Hi Robin,

With "-u", bogofilter auto-updates the wordlist with any message that it
classifies as spam or ham.  A score of 0.520000 indicates that
bogofilter is discarding all the tokens of the message, so it is left
with the default score (0.52).
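
If you want to check a batch of delivered messages for this, you could
read the score back out of the X-Bogosity header that "-p" adds.  This
is only a rough sketch: it assumes the header carries a
"spamicity=<number>" field, that 0.520000 is the default in effect, and
the function names are made up for illustration.

```shell
# Print the spamicity value from the first X-Bogosity header on stdin.
# Assumes the header contains a "spamicity=<number>" field.
get_spamicity() {
    sed -n 's/^X-Bogosity:.*spamicity=\([0-9.]*\).*/\1/p' | head -n 1
}

# Assumed default score; change it if your configuration differs.
DEFAULT_SCORE="${DEFAULT_SCORE:-0.520000}"

# Succeed (exit 0) if the message on stdin has exactly the default score.
is_default_score() {
    [ "$(get_spamicity)" = "$DEFAULT_SCORE" ]
}
```

For example, "is_default_score < msg && echo untrained" would flag a
message that got nothing but the default.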

Probably your problem is that with _no_ initial training, bogofilter
falls back to the default for every token, so no message is ever
classified as spam or ham and _no_ tokens are added to the wordlist.
It's necessary to give it at least 1 message so that it can start
judging.  With this minimalist approach, the results will be pretty bad
at first.  However, since your plan is to correct the (numerous)
mistakes, you should be OK after a while.

You can see how it's scoring a particular message by using "-vvv", as
in:

  bogofilter -vvv < msg

This displays every token found in the message along with the score
bogofilter gives it.  More detail on this output format is in the FAQ.

HTH,

David



