Just saw a new spam tactic
David Relson
relson at osagesoftware.com
Fri Jan 31 03:27:52 CET 2003
At 09:15 PM 1/30/03, Chris Wilkes wrote:
>On Thu, Jan 30, 2003 at 06:06:53PM -0800, Zack Brown wrote:
> > On Wed, Jan 29, 2003 at 11:40:28PM -0800, Max Rible wrote:
> > > I just got a piece of spam that's full of bogus HTML tags-- lots
> > > of </k> tags inserted in the middle of words. The tags will be
> > > ignored by most HTML renderers, but will break up the text for
> > > spam parsing.
> >
> > Is it really necessary for Bogofilter to do anything about this? Won't
> > bogofilter just learn to classify email containing those kinds of tags
> > as spam?
>
>How many tags like that can you make?
> <f> <ff> <fff> <fa> <faa>
>etc. If you're sending out spam with fake tags in it you'll just make
>up different ones for each round. BF will never see the same token
>twice.
One approach would be to keep a list of known, valid HTML tags and create
tokens like "html:tag1", "html:tag2", etc. for them. This would distinguish
a word used as a tag from the same word used as ordinary text.
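A rough sketch of the tag-list idea, in Python for brevity; this is not bogofilter code. The names KNOWN_TAGS and tokenize are made up for illustration, and the tag list is deliberately tiny. Dropping unknown tags outright, so that a word split by a bogus tag rejoins, is an assumption added here and not part of the suggestion above.

```python
import re

# Hypothetical whitelist; a real list would cover the full set of HTML tags.
KNOWN_TAGS = {"a", "b", "body", "br", "font", "html", "i", "img", "p",
              "table", "td", "tr"}

# Matches an opening or closing tag and captures the tag name.
TAG_RE = re.compile(r"</?([A-Za-z][A-Za-z0-9]*)[^>]*>")
WORD_RE = re.compile(r"[A-Za-z0-9$'-]+")


def tokenize(text):
    """Return tag tokens ("html:<name>") followed by plain word tokens."""
    tokens = []

    def replace_tag(match):
        name = match.group(1).lower()
        if name in KNOWN_TAGS:
            tokens.append("html:" + name)
            return " "  # a known tag still acts as a word boundary
        return ""       # assumption: unknown tags vanish, so split words rejoin

    stripped = TAG_RE.sub(replace_tag, text)
    tokens.extend(word.lower() for word in WORD_RE.findall(stripped))
    return tokens
```

With this, tokenize("Via</k>gra <b>now</b>") yields the tag token "html:b" twice plus the words "viagra" and "now", while a plain-text "b" stays the ordinary token "b".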
I have also seen it suggested that URLs, and the words within URLs, be used
in the spamicity calculation.
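The URL idea could be sketched along the same lines. Again this is Python for illustration only, not bogofilter code; the function name url_tokens and the "url:" and "urlword:" prefixes are invented here, and the URL pattern is a simplification that only handles http and https links.

```python
import re

# Simplified pattern: an http(s) URL runs until whitespace, a quote or an angle bracket.
URL_RE = re.compile(r"https?://[^\s<>\"']+", re.IGNORECASE)


def url_tokens(text):
    """Return one token per URL plus one token per word inside each URL."""
    tokens = []
    for url in URL_RE.findall(text):
        tokens.append("url:" + url.lower())
        # Drop the scheme, then split host and path on non-alphanumerics.
        rest = url.split("://", 1)[1]
        for part in re.split(r"[^A-Za-z0-9]+", rest):
            if part:
                tokens.append("urlword:" + part.lower())
    return tokens
```

The prefixes keep a word seen in a link separate from the same word in the message body, so each can earn its own spamicity.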
... and there are probably a gazillion other things that can be done. How
useful they are is, as yet, unknown and untested.