obscured URL not being tokenized
David Relson
relson at osagesoftware.com
Sat Dec 20 23:09:49 CET 2003
Hi Dan,
What version of bogofilter are you using? I think you may be past due
for an update. Here's a quote from file CHANGES-0.15:
2003-10-19
* Added decoding of percent escaped characters in URLs.
I've extracted your message, deleted some of the extra headers and the
ordinary text, and named the result 'msg.obscured.URL.txt'. I then ran
the command:
    bogolexer -D -x l -p -vv < msg.obscured.URL.txt > msg.obscured.URL.tmp
Both the input and output files are attached.
Let me know if you like the result :-)
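For illustration only, here is a minimal Python sketch of what decoding
percent-escaped characters does to the first URL in the attached message.
This is not bogofilter's code (bogofilter is written in C); the function
name decode_percent_escapes is hypothetical, and the sketch simply relies
on the standard library's urllib.parse.unquote.

```python
from urllib.parse import unquote


def decode_percent_escapes(url: str) -> str:
    """Replace each %XX escape with the character it encodes.

    Hypothetical helper for illustration; it only wraps the standard
    library's unquote() and is not how bogofilter does it internally.
    """
    return unquote(url)


# First href from the attached spam message.
obscured = (
    "http://%322%31.2%332.%316%30.1%305"
    "/%7a/s%69l%76e%72/f%61r%6d/i%6ed%65x.%68t%6dl"
)

decoded = decode_percent_escapes(obscured)
print(decoded)  # http://221.232.160.105/z/silver/farm/index.html
```

Once the escapes are decoded, the host and path components become ordinary
text that a lexer can split into tokens.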
I suggest that you upgrade to the current release 0.15.11. While it
hasn't been promoted to "stable" status, it is eminently usable and
will, I expect, do fine for you.
David
Note: It has been reported that 0.15.11 has a problem working with
separate wordlists, i.e. goodlist.db and spamlist.db. I'm waiting for
more info so I can reproduce the problem and fix it. If you're still
using separate wordlists, use bogoupgrade from 0.15.10 to create a
combined wordlist. After that you'll be able to use 0.15.11 without any
problem.
-------------- next part --------------
Return-Path: <umsom at yahoo.com>
Message-ID: <3821481071892992 at p508F3493.dip.t-dialin.net>
From: "catie" <umsom at yahoo.com>
To: "dvsing at sonicspike.net" <dvsing at sonicspike.net>
Subject: FWD: Barn Lovin Bimbos hugdt
Date: Fri, 19 Dec 2003 21:03:12 -0700
MIME-Version: 1.0
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit
<html>
<body bgcolor="#FFFFFF">
<a href="http://%322%31.2%332.%316%30.1%305/%7a/s%69l%76e%72/f%61r%6d/i%6ed%65x.%68t%6dl">
<img border="0" src="http://%322%31.2%332.%316%30.1%305/%7a/s%69l%76e%72/f%61r%6d/e%6et.%6ap%67" width="500" height="300"></a></p>
<p align="left"><FONT face="Verdana, Arial, Helvetica, sans-serif" size=1><B>
<font color="#FF0000">
<span style="background-color: #FFFFFF">Get zero more of these starting tomorrow:
<a target="_blank" href="http://%322%31.2%332.%316%30.1%305/%7a/d%6fn%651.%68t%6dl">
<font color="#000000">CLICK </font></a> </font> </span> <FONT color=#ffffcc>
<a target="_blank" href="http://%322%31.2%332.%316%30.1%305/%7a/d%6fn%651.%68t%6dl">
<FONT color=#000000><span style="background-color: #FFFFFF">HERE </span> </FONT></a></FONT></B></FONT></p>
</body>
</html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: msg.obscured.URL.tmp
Type: application/octet-stream
Size: 2198 bytes
Desc: not available
URL: <https://www.bogofilter.org/pipermail/bogofilter/attachments/20031220/98a491e4/attachment.obj>