Just saw a new spam tactic

Nick Simicich njs at scifi.squawk.com
Thu Jan 30 17:44:48 CET 2003

At 12:08 AM 2003-01-30 -0800, Chris Wilkes wrote:

>On Wed, Jan 29, 2003 at 11:40:28PM -0800, Max Rible wrote:
> > I just got a piece of spam that's full of bogus HTML tags-- lots
> > of </k> tags inserted in the middle of words.  The tags will be
> > ignored by most HTML renderers, but will break up the text for
> > spam parsing.
>Could you post what version of bogofilter you're using?  The latest one
>does include code to throw out HTML comments like this:
>   to<!-- -->ner cart<!-- -->ridge
>and give you what you would see in a browser, mainly:
>   toner cartridge
>I'm not sure about bogus HTML tags though.  It would be nice to get some
>sort of number representing how poorly written an HTML page is.

I am beginning to wonder whether bogofilter should simply remove all tags 
and then break the remaining string on white space.  Nothing will let you 
reconstruct mail that arranges parts of words as tables (other than 
rendering it and then interpreting it by "eyespace"), but my mailer won't 
render those anyway (on purpose).  Max's point is well made:  does it 
really matter whether the thing we are looking at is split by a comment or 
by a tag that will be ignored?  Just pull out of the string to be 
tokenized anything that sits between < and >, rather than only what sits 
between <!-- and -->.
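
To make the idea concrete, here is a rough sketch in Python.  It is not 
bogofilter's code (bogofilter is written in C and has its own lexer); the 
name strip_and_tokenize and the regular expression are made up for 
illustration, and it assumes the message body has already been decoded to 
a plain string.

```python
import re

# Hypothetical pattern, not taken from bogofilter: match either an HTML
# comment (<!-- ... -->) or anything else enclosed in < and >.
# The comment alternative comes first so a ">" inside a comment does not
# end the match early.
TAG_OR_COMMENT = re.compile(r"<!--.*?-->|<[^>]*>", re.DOTALL)


def strip_and_tokenize(text):
    """Delete comments and tags outright, then split on white space."""
    # Replacing with the empty string rejoins words that were split by
    # a comment or a bogus tag.
    return TAG_OR_COMMENT.sub("", text).split()


print(strip_and_tokenize("to<!-- -->ner cart</k>ridge"))
```

One trade-off of deleting tags rather than replacing them with a space: 
words separated only by a legitimate tag such as <br> would be run 
together into a single token.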

SPAM: Trademark for spiced, chopped ham manufactured by Hormel.
spam: Unsolicited, Bulk E-mail, where e-mail can be interpreted generally 
to mean electronic messages designed to be read by an individual, and it 
can include Usenet, SMS, AIM, etc.  But if it is not all three of 
Unsolicited, Bulk, and E-mail, it simply is not spam. Misusing the term 
plays into the hands of the spammers, since it causes confusion, and 
spammers thrive on confusion. Spam is not speech, it is an action, like 
theft, or vandalism. If you were not confused, would you patronize a spammer?
Nick Simicich - njs at scifi.squawk.com - http://scifi.squawk.com/njs.html
Stop by and light up the world!
