Ways to trick the lexer

Thomas Anderson tanderso at oac-design.com
Fri Jun 8 23:50:09 CEST 2007


As the others have basically said, something you may consider to be a
cunning trick at first actually loses its effectiveness rather quickly.
Just train on them.  Come back if they're still causing problems in a
few days or weeks (depending on how many you get a day).

Tom

On Fri, 2007-06-08 at 22:21 +0200, Andreas Pardeike wrote:
> Hi,
> 
> I am getting hundreds of spams with variations of the subject
> "Sexually explicit". They create tokens like
> 
> subj:SEIX8UALLY-E8XPLICITI
> 
> in the database, and since they differ from each other by at least
> one letter, they all get counts of 1. As a result, none of those
> seemingly random letter sequences will get a high spam score.
> 
> Is this behaviour intended? Wouldn't splitting on more boundaries
> give higher word counts, resulting in e.g.
> 
> subj:UALLY
> ...
> 
> or at least
> 
> subj:SEIX8UALLY
> subj:E8XPLICITI
> 
> ?
> 
> Regards,
> Andreas Pardeike
> _______________________________________________
> Bogofilter mailing list
> Bogofilter at bogofilter.org
> http://www.bogofilter.org/mailman/listinfo/bogofilter
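To make the counting argument in the quoted question concrete, here is a rough Python sketch. It is not bogofilter's lexer; the three tokenizer functions and the second and third subject lines are made up for illustration (only the first subject comes from the original post). It compares per-message token counts when the subject is kept whole, split on the hyphen, or split on every non-letter.

```python
import re
from collections import Counter

# Only the first subject is from the original post; the other two are
# hypothetical variants invented for this illustration.
SUBJECTS = [
    "SEIX8UALLY-E8XPLICITI",
    "SEIX3UALLY-E3XPLICITQ",
    "SEXK7UALLY-EX7PLICITZ",
]

def whole(subject):
    """Keep the subject as a single token (what the post describes)."""
    return ["subj:" + subject]

def split_hyphen(subject):
    """Split on the hyphen only (the poster's second suggestion)."""
    return ["subj:" + part for part in subject.split("-") if part]

def split_nonalpha(subject):
    """Split on every run of non-letters (the poster's first suggestion)."""
    return ["subj:" + part for part in re.split(r"[^A-Za-z]+", subject) if part]

def counts(tokenizer):
    """Count in how many subjects each token appears."""
    seen = Counter()
    for subject in SUBJECTS:
        seen.update(set(tokenizer(subject)))
    return seen

if __name__ == "__main__":
    for tokenizer in (whole, split_hyphen, split_nonalpha):
        print(tokenizer.__name__, dict(counts(tokenizer)))
```

With these sample subjects, only the finest split yields a token (subj:UALLY) that recurs in every message. It also produces very short fragments such as subj:E, which could just as easily appear in legitimate mail, so a finer split is not automatically an improvement.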



