obscured URL not being tokenized

David Relson relson at osagesoftware.com
Sat Dec 20 23:09:49 CET 2003


Hi Dan,

What version of bogofilter are you using?  I think you may be past due
for an update.  Here's a quote from file CHANGES-0.15:

	2003-10-19
	* Added decoding of percent escaped characters in URLs.

I've extracted your message, deleted some of the extra headers and the
ordinary text, and named the result 'msg.obscured.URL.txt'.  I then ran
the command:

	bogolexer -D -x l -p -vv < msg.obscured.URL.txt > msg.obscured.URL.tmp

Both the input and output files are attached.  Let me know if you like
the result :-)
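
For illustration, the percent-escape decoding mentioned in the CHANGES
entry can be sketched in a few lines of Python.  This is not
bogofilter's own code (bogofilter is written in C), and the helper name
decode_obscured_url is made up for this example.  It relies only on the
standard library's urllib.parse.unquote and is applied to the first URL
from the attached message:

```python
from urllib.parse import unquote


def decode_obscured_url(url: str) -> str:
    """Replace every %XX escape with the character it encodes.

    Hypothetical helper for illustration only; it is not part of
    bogofilter or bogolexer.
    """
    return unquote(url)


# First href from the attached spam message.
obscured = (
    "http://%322%31.2%332.%316%30.1%305"
    "/%7a/s%69l%76e%72/f%61r%6d/i%6ed%65x.%68t%6dl"
)

print(decode_obscured_url(obscured))
# prints: http://221.232.160.105/z/silver/farm/index.html
```

Once the escapes are decoded, the IP address and the path components
become ordinary text that a lexer can split into tokens.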

I suggest that you upgrade to the current release 0.15.11.  While it
hasn't been promoted to "stable" status, it is eminently usable and
will, I expect, do fine for you.

David

Note:  It has been reported that 0.15.11 has a problem working with
separate wordlists, i.e. goodlist.db and spamlist.db.  I'm waiting for
more info so I can reproduce the problem and fix it.  If you're still
using separate wordlists, use bogoupgrade from 0.15.10 to create a
combined wordlist.  After that you'll be able to use 0.15.11 without any
problem.
-------------- next part --------------
>From 
Return-Path: <umsom at yahoo.com>
Message-ID: <3821481071892992 at p508F3493.dip.t-dialin.net>
From: "catie" <umsom at yahoo.com>
To: "dvsing at sonicspike.net" <dvsing at sonicspike.net>
Subject: FWD:    Barn Lovin Bimbos             hugdt
Date: Fri, 19 Dec 2003 21:03:12 -0700
MIME-Version: 1.0
Content-Type: text/html; charset="iso-8859-1"
Content-Transfer-Encoding: 8bit

<html>
<body bgcolor="#FFFFFF">

<a href="http://%322%31.2%332.%316%30.1%305/%7a/s%69l%76e%72/f%61r%6d/i%6ed%65x.%68t%6dl">
<img border="0" src="http://%322%31.2%332.%316%30.1%305/%7a/s%69l%76e%72/f%61r%6d/e%6et.%6ap%67" width="500" height="300"></a></p>

<p align="left"><FONT face="Verdana, Arial, Helvetica, sans-serif" size=1><B>
<font color="#FF0000">
<span style="background-color: #FFFFFF">Get zero more of these starting tomorrow:
<a target="_blank" href="http://%322%31.2%332.%316%30.1%305/%7a/d%6fn%651.%68t%6dl">
<font color="#000000">CLICK </font></a> </font> </span> <FONT color=#ffffcc>
<a target="_blank" href="http://%322%31.2%332.%316%30.1%305/%7a/d%6fn%651.%68t%6dl">
<FONT color=#000000><span style="background-color: #FFFFFF">HERE </span> </FONT></a></FONT></B></FONT></p>

</body>
</html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: msg.obscured.URL.tmp
Type: application/octet-stream
Size: 2198 bytes
Desc: not available
URL: <https://www.bogofilter.org/pipermail/bogofilter/attachments/20031220/98a491e4/attachment.obj>

