nonconformant RFC-2047 (was: Re: bogofilter-SA-2004-01)
Pavel Kankovsky
peak at argo.troja.mff.cuni.cz
Mon Nov 8 13:01:57 CET 2004
On Fri, 5 Nov 2004, Matthias Andree wrote:
> An encoded word as per RFC-2047 does not contain line feed characters,
> so we should not accept or attempt to decode them.
It depends on how popular MUAs interpret them. If they decode such words
as though they were valid, then spammers might abuse that misfeature to hide
text from Bogofilter. Rather than
Subject: spammyword1 spammyword2 spammyword3
they could write:
Subject: =?iso-8859-1?q?[CR]=12=34=56=...=ab=cd=ef?=
and Bf (using its standard lexer) would see no tokens. On the other hand,
there are other methods to hide spammy tokens (e.g. text interleaved with
spaces, text sent as an image), and some of them are already quite popular
today.
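
To make the concern concrete, here is a minimal Python sketch. It is neither
Bogofilter's lexer nor any real MUA's code: the names STRICT, LENIENT,
decode_q and decode_subject are made up for illustration, the "lenient"
behaviour is only an assumption about what a tolerant MUA might do, and the
hex bytes simply spell "spammyword1". It handles Q-encoding only.

```python
import re

# RFC 2047 encoded-word: =?charset?encoding?encoded-text?=
# STRICT: encoded-text may only contain printable ASCII other than '?' and
# space, so a CR or LF inside it means "not an encoded word".
STRICT = re.compile(r"=\?([^?\s]+)\?([qQ])\?([!->@-~]+)\?=")

# LENIENT: hypothetical tolerant reader that accepts anything up to the
# next '?', including CR and LF.
LENIENT = re.compile(r"=\?([^?\s]+)\?([qQ])\?([^?]+)\?=")


def decode_q(text, charset):
    """Decode Q-encoded text, silently dropping embedded CR/LF."""
    text = text.replace("\r", "").replace("\n", "")
    text = text.replace("_", " ")  # '_' stands for a space in Q-encoding
    raw = re.sub(r"=([0-9A-Fa-f]{2})",
                 lambda m: chr(int(m.group(1), 16)), text)
    return raw.encode("latin-1", errors="replace").decode(charset,
                                                          errors="replace")


def decode_subject(value, pattern):
    """Replace every encoded word matched by `pattern` with its decoding."""
    return pattern.sub(lambda m: decode_q(m.group(3), m.group(1)), value)


subject = "=?iso-8859-1?q?\r=73=70=61=6d=6d=79=77=6f=72=64=31?="

# A strict filter leaves the header alone; a lenient reader shows the word.
print(repr(decode_subject(subject, STRICT)))
print(repr(decode_subject(subject, LENIENT)))
```

If the filter takes the strict path while the recipient's MUA takes the
lenient one, the two see different text, which is exactly the gap a spammer
could exploit.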
--Pavel Kankovsky aka Peak [ Boycott Microsoft--http://www.vcnet.com/bms ]
"Resistance is futile. Open your source code and prepare for assimilation."