Stripsearch

Tom Anderson tanderso at oac-design.com
Tue Jun 14 15:21:40 CEST 2005


----- Original Message ----- 
From: "Tom Anderson" <tanderso at oac-design.com>
> I'm using MIME::QuotedPrint.  The only drawback is that now the email is 
> being altered beyond simply inserting the tokens.  Should I re-encode them 
> before quitting?

This isn't working properly, because sometimes only particular MIME parts 
are encoded.  If I simply decode and re-encode the whole message, it 
corrupts the MIME headers in these emails.  Therefore, I've disabled the 
quoted-printable decoding for now... encoded spam will just have to slip by 
for the time being.  Better to let some spam through than to make some of 
your ham unreadable without going to the source.  The only way I'll be able 
to do it properly is to break the MIME message into its constituent parts 
and decode each of them as needed.  This will also allow me to skip non-text, 
non-HTML parts such as binary attachments -- no need to try to parse those. 
However, doing this means that the script can no longer be limited to the 
body of the email, since I'll need to read the boundary string out of the 
header.  And if I'm doing that, then maybe I should just merge stripsearch 
into spamitarium to make a comprehensive prefilter that parses the message 
only once and avoids firing up the interpreter more than once.  I have 
no time to do so this week, but I'll probably release the new program 
sometime next week.  In the meantime, a version with some bug fixes and with 
quoted-printable decoding turned off is available here:

http://orderamidchaos.com/bogofilter/stripsearch
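
For illustration only, the per-part approach described above might look 
roughly like the following.  This is a Python sketch using the standard 
email package, not the actual stripsearch code; the function name 
decoded_text_parts and the latin-1 fallback are assumptions made for the 
example.  It reads the whole message (headers included, so the boundary 
string is available), decodes each text part according to that part's own 
Content-Transfer-Encoding, and skips everything else.

```python
import email


def decoded_text_parts(raw_message: bytes):
    """Yield (content_type, text) for each text/plain or text/html part.

    Hypothetical helper: each leaf part is decoded on its own, so a
    message that mixes encoded and unencoded parts is handled correctly.
    """
    # Parsing the full message lets the parser read the boundary string
    # from the Content-Type header.
    msg = email.message_from_bytes(raw_message)
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers hold no text of their own
        ctype = part.get_content_type()
        if ctype not in ("text/plain", "text/html"):
            continue  # skip binary attachments and other non-text parts
        # decode=True undoes quoted-printable or base64 for this part only
        payload = part.get_payload(decode=True) or b""
        charset = part.get_content_charset() or "latin-1"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset label: fall back to an assumed default
            text = payload.decode("latin-1", errors="replace")
        yield ctype, text
```

Because the original message is only read, never re-encoded, the MIME 
headers stay untouched; the decoded text would be used for matching only.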

Tom



