bogofilter -p mishandling non-ascii chars?

Barry Gould BarryGould at PennySaverUSA.net
Sat Oct 26 00:06:47 CEST 2002


Hi,

I just received some spam which got tagged with x-spam-status=no by 
BogoFilter 0.7.3.

After messing around with it, trying to feed it back in with -S, etc, I 
realized there were some non-ascii characters at the bottom of the message 
which are giving me (and bash) grief. Every time I tried to paste the body 
into my terminal, it would suspend bogofilter (as if I had hit CTRL-Z).

I've noticed quite a few spams this week with garbage at the bottom of the 
message.
I suspect the garbage is being added to confuse and/or crash spam filters. 
It's certainly working at confusing me and bash.
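
A safer way to see what the garbage actually is, rather than pasting it into a terminal, would be to list the offending bytes from the saved file. The following is only a minimal sketch: the function name find_suspicious_bytes and the script name are made up for illustration. One guess worth checking with it is whether byte 0x1A is present, since that is the CTRL-Z character and the usual default suspend character on a terminal; that is an assumption about the cause, not something confirmed here.

```python
import sys


def find_suspicious_bytes(data: bytes):
    """Return (offset, byte value) pairs for bytes outside printable ASCII.

    Tab, LF and CR count as ordinary text. Any other byte below 0x20,
    DEL (0x7F), and anything at or above 0x80 is reported.
    """
    allowed_controls = {0x09, 0x0A, 0x0D}
    return [
        (offset, value)
        for offset, value in enumerate(data)
        if (value < 0x20 and value not in allowed_controls) or value >= 0x7F
    ]


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_bytes.py spam_mess.txt
    with open(sys.argv[1], "rb") as handle:
        for offset, value in find_suspicious_bytes(handle.read()):
            print(f"offset {offset}: 0x{value:02x}")
```

Reading the file in binary mode keeps the shell and terminal out of the picture entirely, so nothing in the message can be interpreted as a control character.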

Turns out the message was sent as base64, so it did make it through 
bogofilter initially.

However, if I take the decoded (by Eudora) version of the message, save it 
in a text file, and feed it to bogofilter -p, it gets corrupted.

I think this _may_ be a problem.
e.g.
# bogofilter -p <spam_mess.txt  > spam_mess.txt2
# diff spam_mess.txt spam_mess.txt2
43a44
> </x-html>

I am worried about this as I was thinking about doing some sort of base64 
decoding before passing to bogofilter -p. I don't know if this is a good 
idea or not, but obviously I am getting base64 spam and bogofilter isn't 
catching it.
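
To make the idea concrete, here is a rough sketch of the kind of pre-decoding step I have in mind, using Python's standard email package. It is not anything bogofilter provides: the function name decode_text_parts and the script name are hypothetical, the latin-1 fallback for parts with no declared charset is my assumption, and I have not tested whether bogofilter 0.7.3 scores the decoded text any better.

```python
import sys
from email import message_from_bytes


def decode_text_parts(raw: bytes) -> str:
    """Return the headers plus the decoded text/* bodies of a MIME message."""
    msg = message_from_bytes(raw)
    # Drop Content-Transfer-Encoding: the bodies below are no longer encoded.
    chunks = [
        f"{name}: {value}"
        for name, value in msg.items()
        if name.lower() != "content-transfer-encoding"
    ]
    chunks.append("")  # blank line between headers and body
    for part in msg.walk():
        if part.is_multipart() or part.get_content_maintype() != "text":
            continue
        # decode=True undoes base64 or quoted-printable transfer encoding.
        payload = part.get_payload(decode=True)
        if payload is None:
            continue
        # Assumption: fall back to latin-1 when no charset is declared.
        charset = part.get_content_charset() or "latin-1"
        try:
            text = payload.decode(charset, errors="replace")
        except LookupError:
            # Unknown charset name; latin-1 can decode any byte sequence.
            text = payload.decode("latin-1")
        chunks.append(text)
    return "\n".join(chunks)


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python decode_parts.py message.eml | bogofilter -p
    with open(sys.argv[1], "rb") as handle:
        sys.stdout.write(decode_text_parts(handle.read()))
```

Note that this flattens the message: non-text attachments are dropped and multipart structure is lost, so the output is only suitable for scoring, not for delivery in place of the original.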

Currently bogofilter has the "base64" token at:
base64 -> 0.583291

I don't know if this is because I've received a lot of real base64 ham, or 
if I've received a lot of base64 spam that wasn't recognized and was 
therefore stored with bogofilter -h. (I'm using a version of the recipe 
that tests and stores everything.)

I am attaching both the original base64 version and the text version from 
Eudora in a ZIP file in case anyone wants to look at them.

Thanks,
Barry

Barry Gould
Site Administrator, Harte-Hanks Shoppers
714-577-4313 or 877-244-4421
http://www.pennysaverusa.com
http://www.theflyer.com
-------------- next part --------------
A non-text attachment was scrubbed...
Name: base64spam.zip
Type: application/zip
Size: 4101 bytes
Desc: not available
URL: <http://www.bogofilter.org/pipermail/bogofilter/attachments/20021025/95222e20/attachment.zip>

