javascript/spam [was: Several Subjects]
David Relson
relson at osagesoftware.com
Thu Mar 13 05:41:13 CET 2003
At 11:31 PM 3/12/03, Brian Minton wrote:
>I think your message may have gotten cut off...
It certainly did! I think NT was heading south at the time, so I saved a
copy of the message (fortunately). So let's try again... I've CC'd the
list because I found your message to be the first of a new and interesting
kind.
David
**** Full message follows ****
Hi Brian,
Yep, your javascript message was spam3. The structure of the message is:
<html><head>
<SCRIPT LANGUAGE="javascript">
<!--
... [ snipped javascript ] ...
//-->
</SCRIPT>
</HEAD>
<BODY onload="IsP();"></BODY>
</HTML>
From one point of view it's pretty cool - there's nothing there except the
javascript. In effect, the whole BODY is the call to
function IsP(). The function checks for MSIE 6 and takes you to
http://www.commcross.net, which is a rudimentary and incomplete site.
Given that bogofilter currently discards the innards of html
comments and html tags, the message reduces to its header. Bogofilter
scored that at 0.133793, which is pretty hammish. Note: with the suggested
values of ham_cutoff and spam_cutoff of 0.10 and 0.95 (for tri-state
Robinson-Fisher), this message would classify as "Unsure".
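
To make the tri-state arithmetic concrete, here is a minimal sketch in Python of how a score maps onto the three classes. It is not bogofilter's own code; the function name classify() and the handling of scores that land exactly on a cutoff are assumptions made for illustration. Only the cutoff values and the score come from the message above.

```python
# Hypothetical helper, not part of bogofilter: maps a spamicity score
# onto the three classes using the suggested cutoffs from this message.
def classify(score, ham_cutoff=0.10, spam_cutoff=0.95):
    # Assumption: a score exactly on a cutoff counts as Spam or Ham.
    if score >= spam_cutoff:
        return "Spam"
    if score <= ham_cutoff:
        return "Ham"
    # Everything strictly between the two cutoffs is left undecided.
    return "Unsure"


print(classify(0.133793))
```

The score 0.133793 sits just above ham_cutoff (0.10) and far below spam_cutoff (0.95), so it falls in the middle band.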
The message points to the value of keeping and scoring the innards of html.
I guess it's time to add the various options for allowing keeping/scoring
innards of html comments, of valid html tags, and of broken html tags.
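
As a rough illustration of why those options matter, the sketch below tokenizes the message skeleton shown earlier with and without the innards. It uses simple regular expressions rather than bogofilter's real lexer, and the names tokens(), keep_comments and keep_tags are hypothetical, not actual bogofilter options. The snipped javascript is left as the same placeholder used above.

```python
import re

# The message structure quoted earlier, with the javascript still snipped.
MESSAGE_BODY = """<html><head>
<SCRIPT LANGUAGE="javascript">
<!--
... [ snipped javascript ] ...
//-->
</SCRIPT>
</HEAD>
<BODY onload="IsP();"></BODY>
</HTML>"""

COMMENT = re.compile(r"<!--(.*?)-->", re.DOTALL)  # group 1 = comment innards
TAG = re.compile(r"<([^>]*)>")                    # group 1 = tag innards
WORD = re.compile(r"[A-Za-z0-9_$]+")


def tokens(text, keep_comments=False, keep_tags=False):
    """Return word tokens, optionally keeping comment and tag innards.

    Hypothetical sketch; the two flags stand in for the kind of options
    discussed above and are not real bogofilter switches.
    """
    # Comments are handled first so the tag pattern never sees "<!--".
    text = COMMENT.sub(r" \1 " if keep_comments else " ", text)
    text = TAG.sub(r" \1 " if keep_tags else " ", text)
    return WORD.findall(text)


print(tokens(MESSAGE_BODY))                      # both discarded
print(tokens(MESSAGE_BODY, keep_comments=True))  # comment innards kept
print(tokens(MESSAGE_BODY, keep_tags=True))      # tag innards kept
```

With both kinds of innards discarded the body yields no tokens at all, which matches the observation that the message reduces to its header. Keeping tag innards exposes tokens such as onload and IsP, and keeping comment innards would expose whatever the javascript itself contains.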
Now, have I covered all the questions for this message or have I left
something out?
David