BASE64-encoded non-text MIME attachment being tokenized
Pavel Kankovsky
peak at argo.troja.mff.cuni.cz
Mon Dec 2 11:12:17 CET 2013
On Wed, 30 Oct 2013, Matt Garretson wrote:
> I occasionally see binary attachments that bogofilter (1.2.4) will
> tokenize, resulting in tons of BASE64 string fragments in the token
> stream. [...]
> Any ideas what is happening here?
I think it's the at sign (@) in the MIME boundary. Bogofilter uses the
following patterns to detect boundaries and they fail to match when
@ appears in the boundary:
BCHARSNOSPC [[:alnum:]()+_,-./:=?#\']
BCHARS [[:alnum:]()+_,-./:=?#\' ]
MIME_BOUNDARY {BCHARS}*{BCHARSNOSPC}
A boundary containing @ is not RFC 2046-compliant but such nonstandard
strings appear in the wild (and there is already one nonstandard char,
namely #, recognized by Bogofilter).
Speaking of nonstandard characters appearing in MIME boundaries, I have
seen {, } and % as well.
--
Pavel Kankovsky aka Peak / Jeremiah 9:21 \
"For death is come up into our MS Windows(tm)..." \ 21st century edition /
More information about the Bogofilter
mailing list