What to do for HTML comment processing ???

Nick Simicich njs at scifi.squawk.com
Thu Mar 6 08:41:31 CET 2003


At 05:50 PM 2003-03-04 -0500, David Relson wrote:

>Nick,
>
>Your feedback on processing html comments is appreciated.  May I suggest 
>a compromise?

Why?  Unless you have a REASON!  Personal preference is not a good 
reason.  "Because we have done it that way" was not good enough to block 
the flag changes.

And "because it is a standard" is not a good reason.  NO ONE FOLLOWS STANDARDS.

>As the default have bogofilter follow the standard in processing html 
>comments and also have an "aggressive comment" mode that would be more 
aggressive, as has been requested.

Please do not be silly.  This is a stupid suggestion.  There is no valid 
compromise.  What was done in the past, adjusting to spam that opens a 
comment with <!-- and closes it with >, is wrong: the rest of the e-mail 
(unless there is a --> somewhere) is a comment and will not be displayed.  
Try displaying that in Outlook, you know, the mail reader that many spam 
victims use?  Or the Netscape-based AOL?
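
As a rough illustration of the rule described above, here is a minimal 
Python sketch.  It assumes a comment opens at <!-- and ends only at the 
first -->, and that an unterminated comment swallows the rest of the 
message.  The function name split_comments is hypothetical; this is not 
bogofilter's code, and the behavior has not been re-verified here against 
any particular mail reader.

```python
def split_comments(html):
    """Split text into (visible_text, comment_bodies).

    Assumed rule: a comment opens at "<!--" and closes only at the
    first "-->".  A lone ">" does not close it, and a comment with no
    "-->" runs to the end of the message.
    """
    visible = []
    comments = []
    pos = 0
    while True:
        start = html.find("<!--", pos)
        if start == -1:
            # No more comments: the remainder is displayed.
            visible.append(html[pos:])
            break
        visible.append(html[pos:start])
        end = html.find("-->", start + 4)
        if end == -1:
            # Unterminated comment: the rest of the text is hidden.
            comments.append(html[start + 4:])
            break
        comments.append(html[start + 4:end])
        pos = end + 3
    return "".join(visible), comments
```

Under these assumptions, a message such as "buy <!-- x > now" displays 
only "buy ", because the > alone never closes the comment.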

NO BROWSER, NOT ONE SINGLE ONE FOLLOWS THE STANDARDS!!!!!  THE TESTING 
PROVED THAT. THEREFORE IT WOULD BE STUPID FOR BOGOFILTER TO FOLLOW THE 
STANDARDS.

Doing it any way other than the way IE does it is just plain dumb.

My comments (on how to process comments) were based on actually testing how 
IE and Netscape process comments.  If you do things any other way, you are 
simply allowing people to use comments to eat holes in bogofilter.  I will 
listen to someone who has actually tested IE and/or Netscape and come to a 
different conclusion about how they display comments. Or there might be a 
reason I have not heard yet. IE is also the "Microsoft HTML Renderer" so if 
you use the Microsoft Renderer from within, oh, Eudora, it will also 
process the comments the way I mentioned.

I also believe, by the way, that we should process tokens out of comments 
and use those, so that if someone has, for example, javascript routines 
that are common to the spam world, like obfuscators, we will recognize 
them.  The point is to move the comments out of words.  If they are not in 
words, you process them in place.
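
A minimal Python sketch of that idea, under stated assumptions: comments 
run from <!-- to the first --> (or to the end of the text if never 
closed), and a token is simply a run of word characters.  The names 
COMMENT, WORD and tokenize are hypothetical, not bogofilter's.  As a 
simplification, comment tokens are appended after the visible tokens 
rather than kept in place, which loses nothing when token order is 
ignored.

```python
import re

# Assumed comment rule: "<!--" up to the first "-->", or to the end of
# the text when the comment is never closed.
COMMENT = re.compile(r"<!--(.*?)(?:-->|\Z)", re.DOTALL)
# Deliberately simple token rule for illustration only.
WORD = re.compile(r"\w+")

def tokenize(html):
    """Return tokens from the visible text, then tokens from comments."""
    comment_bodies = COMMENT.findall(html)
    # Removing comments first rejoins words that a comment had split.
    visible = COMMENT.sub("", html)
    tokens = WORD.findall(visible)
    # Comment contents are tokenized too, so text hidden in comments
    # still contributes tokens.
    for body in comment_bodies:
        tokens.extend(WORD.findall(body))
    return tokens
```

For example, tokenize("vi<!-- junk -->agra now") yields "viagra" and "now" 
from the visible text, followed by "junk" from the comment.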

And I think that there should not be a flag to make that processing 
optional.  It is a useless complexity.  OK, we can have a flag if we want, 
but we should ignore it. :-)

But compromise for no reason is simply going to add complexity for no reason.

--
SPAM: Trademark for spiced, chopped ham manufactured by Hormel.
spam: Unsolicited, Bulk E-mail, where e-mail can be interpreted generally 
to mean electronic messages designed to be read by an individual, and it 
can include Usenet, SMS, AIM, etc.  But if it is not all three of 
Unsolicited, Bulk, and E-mail, it simply is not spam. Misusing the term 
plays into the hands of the spammers, since it causes confusion, and 
spammers thrive on confusion. Spam is not speech, it is an action, like 
theft, or vandalism. If you were not confused, would you patronize a spammer?
Nick Simicich - njs at scifi.squawk.com - http://scifi.squawk.com/njs.html
Stop by and light up the world!
