Stripsearch
Tom Anderson
tanderso at oac-design.com
Tue Jun 14 15:21:40 CEST 2005
----- Original Message -----
From: "Tom Anderson" <tanderso at oac-design.com>
> I'm using MIME::QuotedPrint. The only drawback is that now the email is
> being altered beyond simply inserting the tokens. Should I re-encode them
> before quitting?
This isn't working properly, as sometimes only particular MIME parts
are encoded. If I simply decode and re-encode the whole message, it
screws up the MIME headers in these emails. Therefore, I've disabled the
quoted-printable decoding for now; encoded spam will just have to slip by
for the time being. Better to let some spam through than to make some of
your ham unreadable without viewing the source. The only way I'll be able
to do it properly is to break up the MIME message into its constituent
parts and decode each of them as needed. This will also allow me to skip
non-text, non-html parts such as binary attachments -- no need to try to
parse those.
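
As a rough illustration of that per-part approach, here is a minimal sketch.
It is written in Python with the standard email package rather than in Perl
with MIME::QuotedPrint, so it is not the stripsearch code itself. The
function name decoded_text_parts and the sample message are made up for the
example. Note that the parser needs the header to find the boundary string.

```python
from email import message_from_string

# Hypothetical sample: one quoted-printable text part, one binary attachment.
RAW = """\
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="XYZ"

--XYZ
Content-Type: text/plain; charset="us-ascii"
Content-Transfer-Encoding: quoted-printable

Buy V=49AGRA now
--XYZ
Content-Type: application/octet-stream
Content-Transfer-Encoding: base64

AAECAw==
--XYZ--
"""


def decoded_text_parts(raw):
    """Return (content_type, text) for each text/plain or text/html part.

    Each part is decoded according to its own Content-Transfer-Encoding,
    so unencoded parts and the MIME headers are left alone.
    """
    msg = message_from_string(raw)
    results = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # container only; its children are visited by walk()
        ctype = part.get_content_type()
        if ctype not in ("text/plain", "text/html"):
            continue  # skip binary attachments and other non-text parts
        data = part.get_payload(decode=True)  # undoes quoted-printable/base64
        charset = part.get_content_charset() or "latin-1"
        try:
            text = data.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset label: fall back to a lossless byte mapping.
            text = data.decode("latin-1")
        results.append((ctype, text))
    return results


if __name__ == "__main__":
    for ctype, text in decoded_text_parts(RAW):
        print(ctype, repr(text))
```

Only the text part comes back decoded; the attachment is never touched, and
nothing is re-encoded, so the original message can be passed on unmodified.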
However, doing this will mean that we can no longer limit the script to the
body of the email, as I'll need to read the boundary string out of the
header. And if I'm doing that, then maybe I should just combine stripsearch
into spamitarium for a comprehensive prefilter that parses the message only
once and avoids firing up the interpreter more than once. I have no time to
do so this week, but I'll probably release the new program sometime next
week. In the meantime, there's a version with some bug fixes and
quoted-printable decoding turned off here:
http://orderamidchaos.com/bogofilter/stripsearch
Tom