Strange bogolexer result

David Relson relson at osagesoftware.com
Wed Apr 9 14:12:23 CEST 2003


At 07:55 AM 4/9/03, Boris 'pi' Piwinger wrote:

>Hi!
>
>I was wondering how accurate the spam markers set by some
>mail server next to mine are. So I wanted to look up
>the values in the database. Taking a typical header:
>
>[3.14@pi ~]$ echo "X-Spam-Flags:
><.POSTMASTER-RFC-IGNORANT.PIGS.PIGS-UNKNOWN.@mx2.univie.ac.at>[195.34.187.2:detebe.org]<lycos.com>"
>|bogolexer
>normal mode.
>get_token: 1 'x-spam-flags`
>get_token: 1 'mx2.univie.ac.at`
>get_token: 5 '195.34.187.2`
>get_token: 1 'detebe.org`
>get_token: 1 'lycos.com`
>5 tokens read.
>
>That says that the information is not looked at. It looks
>like an e-mail address, but why?
>
>pi

pi,

It had me puzzled, too.  So I rebuilt the lexer with "debug" enabled, ran 
it, and saw it accept "POSTMASTER-RFC-IGNORANT.PIGS.PIGS-UNKNOWN" as a 
token.  Then I realized: it's too long!  I had temporarily forgotten that 
bogofilter discards tokens longer than 30 characters.  The userid is 41 
chars, so it never shows up in the output.
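
As a rough illustration of that length check, here is a minimal sketch in 
Python.  It is not bogofilter's actual code (the lexer is written in C); 
the names MAX_TOKEN_LEN and keep_token are made up for this example, and 
the limit of 30 is simply the figure mentioned above.

```python
# Illustrative sketch only; not taken from the bogofilter sources.
MAX_TOKEN_LEN = 30  # hypothetical name; value as stated in this message


def keep_token(token: str) -> bool:
    """Return True if the token is short enough to be kept."""
    return len(token) <= MAX_TOKEN_LEN


# Candidate tokens from the header in the quoted example.
tokens = [
    "x-spam-flags",
    "POSTMASTER-RFC-IGNORANT.PIGS.PIGS-UNKNOWN",
    "mx2.univie.ac.at",
    "195.34.187.2",
    "detebe.org",
    "lycos.com",
]

# Keep only the tokens that pass the length check.
kept = [t for t in tokens if keep_token(t)]

# Show each candidate with its length and whether it survives.
for t in tokens:
    status = "kept" if keep_token(t) else "dropped"
    print(f"{len(t):2d} {status:7s} {t}")
```

Only the long userid fails the check, which matches the five tokens 
bogolexer reported.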

David





