How to avoid s p lit up wor ds?

David Relson relson at osagesoftware.com
Fri Jan 17 21:29:37 CET 2003


At 02:44 PM 1/17/03, Chris Wilkes wrote:

>On Fri, Jan 17, 2003 at 02:31:51PM -0500, David Relson wrote:
> > At 02:28 PM 1/17/03, Chris Wilkes wrote:
> >
> > >I'm starting to get a lot of spam that looks like:
> > >        buy  to ner  car tri dg es
> > >where the bad words are split up into 2 or 3 letter words.  Since BF
> > >throws out those words it could get by.
> > >
> > >What can BF do to combat this?  Granted, most spam like that has to
> > >contain a URL that can be caught.
> > >
> > >Maybe a simple frequency count of spam words vs larger ones would catch
> > >this?
> >
> > Are you running 0.9.1.2?  I'd like to see what the latest experimental
> > version (with mime processing) does with it.  There've been some changes in
> > html handling that might make a difference with those messages.  Can you
> > send me a sample, preferably a whole message (zipped)?
>
>Unfortunately I've deleted those spams.  And thinking about it, saying "a
>lot" might be an overstatement, as it was only a couple, but they stood out as
>interesting ones.
>
>Running 0.9.1.1 and only on tested emails so I can't say if BF caught
>the above spams.  I'm more interested from just a theoretical
>standpoint.
>
>I'll get the .2 one and look at the mime processing as about all my spam
>are html messages.  Since I'm using mutt I don't even see the message.

Do you remember what the spam was using to split up the words?  I'll do 
some experiments using bogolexer, but it'd be helpful to know what the 
original looked like.
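
For experimenting, the frequency-count idea quoted above could be sketched
roughly as follows.  This is only a hypothetical Python illustration, not
bogofilter or bogolexer code: the function name, the 3-letter cutoff, and the
plain alphabetic tokenizing are all assumptions made for the example.

```python
import re

# Assumed cutoff: tokens of this length or less count as "short".
# The spam described above used 2 or 3 letter fragments.
MAX_SHORT_LEN = 3


def short_token_ratio(text, max_short_len=MAX_SHORT_LEN):
    """Return the fraction of alphabetic tokens that are short.

    A value near 1.0 suggests the text consists mostly of fragments,
    as in words deliberately split up to slip past a tokenizer that
    discards short words.
    """
    tokens = re.findall(r"[A-Za-z]+", text)
    if not tokens:
        return 0.0
    short = sum(1 for t in tokens if len(t) <= max_short_len)
    return short / len(tokens)


if __name__ == "__main__":
    # Sample line taken from the quoted message, plus an unsplit version.
    print(short_token_ratio("buy  to ner  car tri dg es"))
    print(short_token_ratio("buy toner cartridges"))
```

A ratio like this would presumably work best as one extra clue alongside the
normal word scores, since ordinary English also has plenty of short words.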

0.9.1.2 is the latest stable version.  The mime processing code is 
presently only available from cvs on SourceForge.  If you want, I can make 
source and/or binary rpms available for you with the latest code.




