tanderso at oac-design.com
Mon Sep 13 15:01:09 EDT 2004
From: "Pavel Kankovsky" <peak at argo.troja.mff.cuni.cz>
> On Tue, 7 Sep 2004, Tom Anderson wrote:
>> Careful about /dev/null'ing JScript.Encode... it's a Microsoft
>> proprietary technology, [...]
> And this is a good and sufficient reason to stop it before it spreads
> like a contagious disease.
This is not for bogofilter to decide. Unlike Microsoft products, bogofilter
should be adaptable, and not be used for political reasons. The fact is
that Outlook and Outlook Express are ubiquitous. Not allowing for their
quirks will make bogofilter less useful, not Outlook. It would stop
bogofilter's spread, not the other way around.
> Moreover, I do not think anyone has a legitimate reason to obfuscate
> (obfuscation is not encryption) email contents. Either the recipient is
> intended to see it, then there is no point in obfuscation, or the
> recipient is not intended to see, and then it should not be sent in
> the first place.
> JScript.Encode is good for spammers and malware. And perhaps for MS with
> its delusions of world domination. It is bad for anyone else.
I agree completely, but Microsoft is not deluded about the reality of their
domination. They always have and always will "embrace and extend". If they
want to make obfuscation standard practice (because people actually think
it secures their code!), then they will do so by leveraging their monopoly.
Nobody is going to use bogofilter or any other software which doesn't work
correctly with the email clients used by 90% of people.
> Well, JS is just another level of obfuscation. There is no reliable way to
> determine what the real visible contents of "JS-enabled HTML" is short of
> running the code in question.
That might be an interesting prefilter to bogofilter... a program that
runs the script and passes the resulting visible content to
bogofilter for scoring. I'm sure such a program could make liberal use of
the Mozilla engine to do so.
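
A rough sketch of the plumbing for such a prefilter, in Python. It does not
run any script; it only shows a cleanup step sitting in front of the scorer.
The names score_with_prefilter, prefilter and scorer_cmd are hypothetical.
The only assumptions about bogofilter are that it reads a message on stdin
and reports its verdict through the exit status (0 spam, 1 non-spam,
2 unsure, as its man page describes).

```python
import subprocess
from typing import Callable, Sequence


def score_with_prefilter(
    raw_message: bytes,
    prefilter: Callable[[bytes], bytes],
    scorer_cmd: Sequence[str] = ("bogofilter",),
) -> int:
    """Clean a message with `prefilter`, then hand it to the scorer.

    Returns the scorer's exit status. For bogofilter that is assumed to be
    0 = spam, 1 = non-spam, 2 = unsure.
    """
    # Hypothetical cleanup step: de-obfuscate, strip, or render the message.
    cleaned = prefilter(raw_message)
    # The scorer reads the cleaned message on standard input.
    result = subprocess.run(list(scorer_cmd), input=cleaned)
    return result.returncode
```

Keeping the scorer command a parameter means the same wrapper can be tried
against any stdin-based filter, not just bogofilter.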
> same way (more or less). It should recognize their presence and be able to
> recognize them as strong spam indicators.
This is clearly not a rule. Plenty of unsophisticated end-users take
advantage of their email clients' built-in abilities to send dynamic
content. Whether script is a spam indicator depends
entirely on your training corpus. The most accurate way to handle it would
be to unobfuscate it as much as possible, and then score it normally, or
else ignore it completely. Biasing script as spam is arbitrary and
incorrect for most people.
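
To illustrate the "ignore it completely" option, here is a minimal Python
sketch that drops everything inside <script> elements and keeps the rest of
the HTML text for normal scoring. ScriptStripper and visible_text are
made-up names; this uses only the standard html.parser module and makes no
attempt to decode JScript.Encode or to run any script.

```python
from html.parser import HTMLParser


class ScriptStripper(HTMLParser):
    """Collect text from HTML while skipping the contents of <script>."""

    def __init__(self) -> None:
        super().__init__(convert_charrefs=True)
        self._skip_depth = 0   # > 0 while inside a <script> element
        self.chunks = []       # text fragments found outside scripts

    def handle_starttag(self, tag, attrs):
        if tag == "script":
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag == "script" and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Script bodies arrive here too; drop them, keep everything else.
        if self._skip_depth == 0:
            self.chunks.append(data)


def visible_text(html: str) -> str:
    """Return the non-script text of `html` with whitespace collapsed."""
    parser = ScriptStripper()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())
```

The script body is simply not tokenized, so whether script-bearing mail
scores as spam is left to the rest of the message and the training corpus.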