Passthrough mode - X-Bogosity header & its effect on the corpus

David Relson relson at osagesoftware.com
Mon Dec 1 13:29:22 CET 2003


On Mon, 1 Dec 2003 11:34:17 +0000
Stroller <Linux.Luser at myrealbox.com> wrote:

...[snip]...
> 
> My concern is that when I move this message to my "Spam/Definite" 
> folder & use it to teach Bogofilter, then that header will affect the 
> corpus. I guess there are a number of possible repercussions - that 
> Bogofilter will expect an incoming message without the "X-Bogosity: 
> Yes" line to be ham, that the word "Yes" has an increased bogosity, or
> just that my Bogofilter database will get filled up with 
> "spamicity=0.x" entries.
> 
> Does Bogofilter ignore X-Bogosity headers..? I can see no mention of 
> this in the manpage.

Greetings Stroller,

From your query it sounds like you might still be using 0.13.7.2.  If
so, I'd suggest upgrading to the current stable release, which is 0.15.7.
It has lots of improvements, does a better job, and is faster.  Also, my
response assumes the current release, so it may be slightly "off" with
regard to an older release.

You can see what's in the wordlist by using the command "bogoutil -d
wordlist.db".  If you want to check for particular tokens, use "bogoutil
-w wordlist.db token1 token2" or "cat list_of_tokens | bogoutil -w
wordlist.db".  Worth noting is that '=' is a separator character and is
_never_ part of a token, and that bogofilter ignores purely numeric
tokens.  So "spamicity=0.123456" becomes the single token "spamicity".
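
To make the two rules above concrete, here is a rough Python sketch.  It
is not bogofilter's lexer, only an illustration: the function name
sketch_tokens is made up, and it assumes that '=' and whitespace are the
only separators and that "purely numeric" means digits and dots.

```python
import re

# Assumption: only '=' and whitespace are modelled as separators here;
# bogofilter's real lexer has many more rules than this.
SEPARATORS = re.compile(r"[=\s]+")

# Assumption: "purely numeric" is taken to mean digits and dots only.
NUMERIC = re.compile(r"^[0-9.]+$")


def sketch_tokens(text):
    """Split text on the separators and drop purely numeric pieces."""
    pieces = [p for p in SEPARATORS.split(text) if p]
    return [p for p in pieces if not NUMERIC.match(p)]


print(sketch_tokens("spamicity=0.123456"))  # ['spamicity']
```

For what actually ended up in your wordlist, "bogoutil -d wordlist.db"
remains the authoritative check.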

All that being said, bogofilter _does_ ignore the "X-Bogosity:" header
line.  It sounds like something I should add to the FAQ :-)
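
If you are stuck on an old release and are unsure whether it skips the
header, one option is to strip the line yourself before training.  The
sketch below uses Python's standard email module; the function name
without_bogosity and the sample message are made up for illustration,
and none of this is needed with a release that already ignores the
header.

```python
from email import message_from_string


def without_bogosity(raw_message):
    """Return the message text with any X-Bogosity header removed."""
    msg = message_from_string(raw_message)
    # Deleting a header removes every occurrence and is a no-op
    # when the header is absent.
    del msg["X-Bogosity"]
    return msg.as_string()


# Illustrative sample only, not a real bogofilter-tagged message.
sample = (
    "Subject: hi\n"
    "X-Bogosity: Yes, spamicity=0.999\n"
    "\n"
    "body text\n"
)
print(without_bogosity(sample))
```

The cleaned text could then be handed to bogofilter for training in
place of the original message.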

Hope this helps!

David



