Just saw a new spam tactic

Fri Jan 31 03:27:52 CET 2003

At 09:15 PM 1/30/03, Chris Wilkes wrote:

>On Thu, Jan 30, 2003 at 06:06:53PM -0800, Zack Brown wrote:
> > On Wed, Jan 29, 2003 at 11:40:28PM -0800, Max Rible wrote:
> > > I just got a piece of spam that's full of bogus HTML tags-- lots
> > > of </k> tags inserted in the middle of words.  The tags will be
> > > ignored by most HTML renderers, but will break up the text for
> > > spam parsing.
> >
> > Is it really necessary for Bogofilter to do anything about this? Won't
> > bogofilter just learn to classify email containing those kinds of tags
> > as spam?
>
>How many tags like that can you make?
>   <f> <ff> <fff> <fa> <faa>
>etc.  If you're sending out spam with fake tags in it you'll just make
>up different ones for each round.  BF will never see the same token
>twice.

One approach would be to have a list of known, valid tags and create tokens 
like "html:tag1", "html:tag2", etc.  This would distinguish usage as a tag 
from the same word being used as ordinary text.
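
A minimal sketch of that idea in Python, not bogofilter's actual lexer. The
names KNOWN_TAGS and tokenize are hypothetical, the tag list is a tiny
illustrative subset, and dropping unknown tags entirely (so that a word split
by </k> is rejoined) is an assumption added here to cover the made-up tags
discussed above.

```python
import re

# Assumption: a small illustrative subset of valid HTML tag names.
KNOWN_TAGS = {"a", "b", "body", "br", "font", "html", "i", "img", "p", "table", "td", "tr"}

# Matches an opening or closing tag and captures its name.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^<>]*>")
WORD_RE = re.compile(r"[A-Za-z0-9]+")


def tokenize(text):
    """Return tag tokens ("html:<name>") followed by lowercased word tokens."""
    tokens = []

    def replace_tag(match):
        name = match.group(1).lower()
        if name in KNOWN_TAGS:
            # A known tag becomes its own token, distinct from the plain word.
            tokens.append("html:" + name)
            return " "
        # Assumption: an unknown tag is removed with no separator,
        # which rejoins a word that was split by a bogus tag.
        return ""

    stripped = TAG_RE.sub(replace_tag, text)
    tokens.extend(word.lower() for word in WORD_RE.findall(stripped))
    return tokens


if __name__ == "__main__":
    print(tokenize("Buy V</k>iag</k>ra <b>now</b>"))
```

With this scheme the plain word "b" and the tag <b> produce different tokens,
and a fresh made-up tag name never reaches the word list at all.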

I have also seen it suggested that URLs, and the words within URLs, be used
in the spamicity calculation.
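
As a rough illustration of what URL-derived tokens could look like, here is a
short Python sketch. The function name url_tokens, the "url:" prefix, and the
choice to split on non-alphanumeric characters are all assumptions made for
the example, not something bogofilter is known to do.

```python
import re
from urllib.parse import urlsplit

# Rough pattern for http/https URLs; stops at whitespace, quotes and angle brackets.
URL_RE = re.compile(r"https?://[^\s\"'<>]+", re.IGNORECASE)


def url_tokens(text):
    """Return "url:"-prefixed tokens for each URL's host and its component words."""
    tokens = []
    for url in URL_RE.findall(text):
        try:
            parts = urlsplit(url)
        except ValueError:
            # Malformed URL (for example an unbalanced IPv6 bracket): skip it.
            continue
        host = parts.hostname or ""
        if host:
            tokens.append("url:" + host)
        pieces = (host + " " + parts.path + " " + parts.query).lower()
        for word in re.split(r"[^a-z0-9]+", pieces):
            if word:
                tokens.append("url:" + word)
    return tokens


if __name__ == "__main__":
    sample = 'Click <a href="http://www.cheap-pills.example.com/buy/now.html?id=42">here</a>'
    print(url_tokens(sample))
```

The prefix keeps a word seen in a link separate from the same word in the
message body, in the same spirit as the "html:" tokens above.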

... and there are probably a gazillion other things that can be done.  How 
useful they are is, as yet, unknown and untested.