BASE64-encoded non-text MIME attachment being tokenized

Matt Garretson mattg at assembly.state.ny.us
Wed Oct 30 19:18:11 CET 2013


I occasionally see binary attachments that bogofilter (1.2.4) will
tokenize, resulting in tons of BASE64 string fragments in the token
stream.  Here's an example message (cleaned a bit):

   http://pastebin.com/FEvjRp9F       (line 38+)

and the output of bogolexer on it:

   http://pastebin.com/tabnXZSP       (line 94+)

Since the attachments are not of a text type, I would expect bogofilter
to skip over them.
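
One way to double-check that expectation independently of bogofilter is to
list each MIME part's declared content type and transfer encoding. Below is
a minimal sketch using only Python's standard `email` package. The function
name `list_parts` and the `SAMPLE` message are hypothetical illustrations,
not taken from the pastebin message, and the sketch says nothing about how
bogofilter itself decides which parts to tokenize:

```python
from email import message_from_bytes, policy


def list_parts(raw: bytes):
    """Return (content_type, transfer_encoding, is_text) for each leaf MIME part."""
    msg = message_from_bytes(raw, policy=policy.default)
    rows = []
    for part in msg.walk():
        if part.is_multipart():
            continue  # containers carry no body of their own
        ctype = part.get_content_type()
        # A missing Content-Transfer-Encoding header means 7bit (RFC 2045).
        cte = str(part.get("Content-Transfer-Encoding", "7bit")).strip().lower()
        rows.append((ctype, cte, part.get_content_maintype() == "text"))
    return rows


# Hypothetical two-part message: one text part, one base64 binary part.
SAMPLE = (
    b"MIME-Version: 1.0\r\n"
    b'Content-Type: multipart/mixed; boundary="XX"\r\n'
    b"\r\n"
    b"--XX\r\n"
    b"Content-Type: text/plain; charset=us-ascii\r\n"
    b"\r\n"
    b"hello\r\n"
    b"--XX\r\n"
    b"Content-Type: application/octet-stream\r\n"
    b"Content-Transfer-Encoding: base64\r\n"
    b"\r\n"
    b"AAECAwQF\r\n"
    b"--XX--\r\n"
)

if __name__ == "__main__":
    for ctype, cte, is_text in list_parts(SAMPLE):
        print(f"{ctype:30} {cte:10} text={is_text}")
```

If a part that ends up tokenized shows a non-text type together with
base64 here, the headers are being parsed as intended by a second parser,
which would point at the filter's handling rather than at a malformed
message. If the part instead shows up as text/plain or the boundaries are
not recognized, the message structure itself is the more likely cause.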

Any ideas what is happening here?  Is there a debug flag that would help
narrow it down?

Thanks...


