SPAN style="DISPLAY: none" spams

Tony L. Svanstrom tony at moon.pp.se
Thu Jul 28 01:34:06 CEST 2005


On Wed, 27 Jul 2005 the voices made Tom Anderson write:

TA> to me, deleting all HTML is the ugliest possible hack.

 Come on, I almost want to call troll on that one...

 If you were to get a (snailmail) envelope with 1 page containing all the data
that you want and, in the same envelope, the same data again on 10 pages in a
format that makes it harder for you to "access" it (read, understand, work with
etc), then you wouldn't think twice about discarding those unwanted 10 pages.

 To me it's the same with e-mail; I get the data in a bloated format I don't
like, so I remove the stuff I don't want; that's about as much of an ugly hack
as removing unwanted e-mails (like spam).

 I also rewrite e-mails to make them more like I want them to be; I remove
spaminess information added by mailservers which I don't control, I rewrite the
subject line when it contains translated "Re: " prefixes, I remove spam/virus
prefixes added to the subject line by some servers, I decode some attachments
(like base64-encoded text files), I remove "signatures" which are ads added by
some servers and so on...
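
 A minimal sketch of that kind of rewriting, in Python with only the standard
library. The prefix list, the tag patterns, the "X-Spam-" header match and the
function names are all assumptions made for illustration; this is not the
actual setup described above, and real servers use many other markers.

```python
import re
from email import message_from_string

# Hypothetical examples of translated reply prefixes (sv = Swedish, aw = German).
REPLY_PREFIXES = ("re", "sv", "aw", "antw", "vs")
PREFIX_RE = re.compile(r"^\s*(?:%s)\s*:\s*" % "|".join(REPLY_PREFIXES), re.I)
# Hypothetical server-added tags such as "[SPAM]" or "***VIRUS***".
TAG_RE = re.compile(
    r"^\s*(?:\[(?:spam|virus)[^\]]*\]|\*+\s*(?:spam|virus)\s*\*+)\s*", re.I
)

def clean_subject(subject):
    """Strip leading tags and collapse any reply prefixes into one 'Re: '."""
    is_reply = False
    changed = True
    while changed:
        changed = False
        stripped = TAG_RE.sub("", subject, count=1)
        if stripped != subject:
            subject, changed = stripped, True
        stripped = PREFIX_RE.sub("", subject, count=1)
        if stripped != subject:
            subject, changed, is_reply = stripped, True, True
    return ("Re: " if is_reply else "") + subject.strip()

def clean_message(raw):
    """Drop X-Spam-* headers and normalise the subject of one raw message."""
    msg = message_from_string(raw)
    for name in list(msg.keys()):
        if name.lower().startswith("x-spam-"):
            del msg[name]  # removes every header with this name
    if msg["Subject"] is not None:
        msg.replace_header("Subject", clean_subject(msg["Subject"]))
    return msg
```

 Something like clean_message() could be run from a delivery script before the
message reaches the filter or the mailbox; decoding attachments and removing
ad "signatures" are left out to keep the sketch short.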

TA> Parsing HTML may not be such a bad idea, but the beauty of statistical
TA> filtering is that you don't need to for the most part.  HTML does not cause
TA> any problems for my filter.

 Depending on how the "statistical filtering" works it's not a huge problem,
but if you can trick the filter by adding "invisible" content then it is a
problem; and the person starting this thread/subject had exactly that problem,
so it is a problem with BF (at least for some).
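
 To make the "invisible" content trick concrete, here is a rough sketch in
Python (standard library only) that keeps just the text a reader would see and
drops anything inside an element styled with "display: none", like the SPAN in
the subject of this thread. The class and function names are made up for the
example, and this is not something bogofilter itself does; it is the kind of
preprocessing one could put in front of a filter.

```python
import re
from html.parser import HTMLParser

HIDDEN_RE = re.compile(r"display\s*:\s*none", re.I)
# Elements without a closing tag; they must not affect the nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "param", "source", "track", "wbr"}

class VisibleText(HTMLParser):
    """Collect text that is not inside a display:none element."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self.hidden_depth = 0  # > 0 while inside a hidden element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        style = dict(attrs).get("style") or ""
        # Once inside a hidden element, every nested element is hidden too.
        if self.hidden_depth or HIDDEN_RE.search(style):
            self.hidden_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.hidden_depth:
            self.hidden_depth -= 1

    def handle_data(self, data):
        if not self.hidden_depth:
            self.parts.append(data)

def visible_text(html):
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    # Collapse runs of whitespace left behind by the removed markup.
    return " ".join("".join(parser.parts).split())
```

 This only looks at inline style attributes and assumes the hidden elements are
closed properly; spam using stylesheets, tiny fonts or matching colours would
need more work, which is one reason some of us just drop the HTML part.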

TA> Yep, those Amish really know how to live life to the fullest...

 I bet they're happier than a lot of us, but... If you seriously compare
wanting my text messages to be just text messages with living without
electricity (which a lot of Amish use, in a limited way) then you're just an
[insert friendlier version of the word "idiot" here].

 Staying with the Amish analogy; the Amish don't stay away from everything
which was invented after a certain year, they just pick what they want to
use and how they want to use it.
 Simply because something is newer it doesn't mean that it's better; or would
you prefer that e-mail follows the evolution of the www, so that you soon will
get a lot of e-mail which requires flash, java and javascript only available
in IE?

TA> > Sooner or later, of course, we'll see something like we do today with
TA> > javascripts great at popping up ads even though a lot of people are using
TA> > popup blockers
TA>
TA> Mozilla works fine for me.  I don't get any popups.

 Bad news for ya here, you're not the whole world.

TA> Spam is dying.  In another 3-5 years, it'll be gone altogether.

 I've heard that before.

TA> Sooner if any of the M$ propaganda about Longhorn/Vista is true (BTW, buy
TA> MSFT while it's cheap).

 You're trusting Microsoft to save the Internet from some of its problems... I
think it's best said in the Fish Licence-sketch (Monty Python): "You are a
looney."

TA> Deleting HTML is not the answer.

 Sure it is, it might not be your answer though...

TA> looking up spamvertized URLs in block lists.

 How? What tools/lists are you using for that?


	/Tony
-- 
        /\___/\                                          /\___/\
        \_@ @_/                                          \_@ @_/
   .--oOO-(_)-OOo--------------------------------------oOO-(_)-OOo--.
   |  perl -e'print$_{$_} for sort%_=`lynx -dump svanstrom.com/t`'  |
   `---ôôô---ôôô----------------------------------------ôôô---ôôô---´
       \O/   \O/        ©1998-2005 svanstrom.com        \O/   \O/



