obscured URL not being tokenized

David Relson relson at osagesoftware.com
Sat Dec 20 23:09:49 CET 2003


Hi Dan,

What version of bogofilter are you using?  I think you may be past due
for an update.  Here's a quote from file CHANGES-0.15:

	2003-10-19
	* Added decoding of percent escaped characters in URLs.

I've extracted your message, deleted some of the extra headers and the
ordinary text, and named the result 'msg.obscured.URL.txt'.  I then ran
the command

	bogolexer -D -x l -p -vv < msg.obscured.URL.txt > msg.obscured.URL.tmp

Both the input and output files are attached.
Let me know if you like the result :-)
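
To illustrate why decoding percent escapes matters for tokenizing, here
is a rough sketch in Python.  It is not bogofilter's lexer (which is
written in C and has its own token rules); the function names, the
simple alphanumeric token pattern, and the sample URL are all
assumptions made up for this example.

```python
import re
from urllib.parse import unquote


def decode_url(url: str) -> str:
    """Replace %XX escapes with the characters they stand for."""
    return unquote(url)


def tokens(text: str) -> list[str]:
    """Split text into runs of letters and digits.

    Hypothetical tokenizer for illustration only; the real lexer
    uses different rules.
    """
    return re.findall(r"[A-Za-z0-9]+", text)


# Made-up obscured URL: %65 is 'e' and %70 is 'p'.
obscured = "http://www.%65xample.com/%70ills"

# Without decoding, the escapes leak into the tokens.
print(tokens(obscured))

# With decoding first, the intended words come out as tokens.
print(tokens(decode_url(obscured)))
```

Without the decoding step the escaped words show up as odd fragments
that never match anything in the wordlist, which is the symptom you
described.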

I suggest that you upgrade to the current release, 0.15.11.  While it
hasn't been promoted to "stable" status, it is eminently usable and
will, I expect, do fine for you.

David

Note:  It has been reported that 0.15.11 has a problem working with
separate wordlists, i.e. goodlist.db and spamlist.db.  I'm waiting for
more info so I can reproduce the problem and fix it.  If you're still
using separate wordlists, use bogoupgrade from 0.15.10 to create a
combined wordlist.  After that you'll be able to use 0.15.11 without any
problem.
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: msg.obscured.URL.txt
URL: <http://www.bogofilter.org/pipermail/bogofilter/attachments/20031220/98a491e4/attachment.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: msg.obscured.URL.tmp
Type: application/octet-stream
Size: 2198 bytes
Desc: not available
URL: <http://www.bogofilter.org/pipermail/bogofilter/attachments/20031220/98a491e4/attachment.obj>