segfault on rfc2047-like subject

Matthias Andree matthias.andree at gmx.de
Sat Oct 9 02:06:01 CEST 2004


David Relson <relson at osagesoftware.com> writes:

> On Sat, 09 Oct 2004 01:27:29 +0200
> Matthias Andree wrote:
>
>> Clint Adams <schizo at debian.org> writes:
>> 
>> > Subject: [Broken]
>> > =?ISO-8859-1?Q?Re=3A_=5BBroken=5D_=3D=3FISO-8859-1=3FQ=3F=3D5B?=
>> >  =?ISO-8859-1?Q?Broken=3D5DBlah=3D20Foo=3DE4=3D20Bar=3D20Blah
>> >  _?= =?ISO-8859-1?Q?Foo=3D3D28=3D5F=3F=3D_Bar=5F=5F=3F=3D_t=E4Blah?=
>> >  =?ISO-8859-1?Q?Foo=E4t=29?=
>> 
>> Gee. qp_validate was tampering with the string it was passed, turning
>> the LF into a NUL and replacing underscores by blanks (that surely
>> doesn't belong in a "validate" function).
>
> Yep.  Sounds like a fix is needed.  Do you want to do it or shall I?

[tick] done.

> Looks like new-lines aren't allowed in the middle of a token.

That's a hard one, because only real line breaks may be represented as
line breaks; any other CR, LF or CRLF must be encoded as =0D, =0A or
=0D=0A. On top of that there are "soft line breaks", which are line
breaks in the encoded mail to support long lines but do not appear in
the decoded text.

We don't need to care too much on the decoding side, but I'm
wondering how robustly we handle extraneous CR characters. The
"Minolta" group at YahooGroups regularly sees improperly encoded mails

which look=\r
like this=\r
paragraph

which means that even after the MDA has reduced the transport CRCRLF to
CRLF, we'll still have a stray CR. I don't mind too much.
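
For illustration only, here is a minimal sketch of a soft-line-break
stripper that tolerates such a stray CR. The name qp_strip_soft_breaks
is hypothetical and this is not the bogofilter code; it assumes the
buffer is handled by length, not as a C string.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper, not bogofilter code: removes quoted-printable
 * soft line breaks in place and returns the new length.
 * Accepts "=\n", "=\r\n" and the malformed "=\r" without a following LF. */
size_t qp_strip_soft_breaks(char *buf, size_t len)
{
    size_t in = 0, out = 0;

    assert(buf != NULL || len == 0);
    while (in < len) {
        if (buf[in] == '=' && in + 1 < len &&
            (buf[in + 1] == '\r' || buf[in + 1] == '\n')) {
            in++;                        /* skip the '=' */
            if (in < len && buf[in] == '\r')
                in++;                    /* skip the (possibly stray) CR */
            if (in < len && buf[in] == '\n')
                in++;                    /* skip the LF, if there is one */
            continue;
        }
        buf[out++] = buf[in++];          /* ordinary byte: keep it */
    }
    return out;
}
```

Escapes such as =3D are left alone here; only the line joining is shown.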

>> The NUL then choked text_decode which was supposed to treat word_t
>> rather than C strings.
>
> AFAICT, the NUL causes strstr() to not find the end of the token.  The
> NULL returned by strstr() then causes "uint size = (uint) (txt - beg)"
> to give a bogus result.  Bah, humbug!

Exactly.

>
>> I think we need to tell the quoted-printable decoder whether it is
>> decoding RFC-2047 (transform underscore to blank; blank and tab
>> illegal) or RFC-2045 (pass underscore, blank and tab allowed).
>
> Do you want to tackle RFC-2045 vs RFC-2047???

Done.
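
Roughly the idea, as a sketch with hypothetical names (qp_mode_t,
QP_RFC2045, QP_RFC2047 and the signatures below are assumptions for
illustration, not necessarily what is in the tree): the validator takes
a const pointer and a length and never writes to its input, and the
mode decides how underscore, blank and tab are treated. Treating
RFC-2045 mode as "accept anything" is a simplification of this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical mode flag: body encoding vs. header encoded-word. */
typedef enum { QP_RFC2045, QP_RFC2047 } qp_mode_t;

static int hex_val(unsigned char c)
{
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    return -1;
}

/* Checks the text without modifying it. In RFC-2047 mode blank, tab,
 * other control characters (LF, NUL), non-ASCII bytes and '?' are
 * rejected, and '=' must be followed by two hex digits.
 * In RFC-2045 mode this sketch accepts everything. */
bool qp_validate(const unsigned char *txt, size_t len, qp_mode_t mode)
{
    assert(txt != NULL || len == 0);
    if (mode != QP_RFC2047)
        return true;
    for (size_t i = 0; i < len; i++) {
        unsigned char c = txt[i];
        if (c <= ' ' || c >= 0x7f || c == '?')
            return false;
        if (c == '=') {
            if (i + 2 >= len ||
                hex_val(txt[i + 1]) < 0 || hex_val(txt[i + 2]) < 0)
                return false;
            i += 2;
        }
    }
    return true;
}

/* Decodes src[0..len) into dst, which must hold at least len bytes.
 * Returns the decoded length. Underscore becomes blank only in
 * RFC-2047 mode; malformed '=' sequences are copied through as-is. */
size_t qp_decode(const unsigned char *src, size_t len,
                 unsigned char *dst, qp_mode_t mode)
{
    size_t out = 0;

    assert((src != NULL && dst != NULL) || len == 0);
    for (size_t i = 0; i < len; i++) {
        unsigned char c = src[i];
        if (c == '=' && i + 2 < len &&
            hex_val(src[i + 1]) >= 0 && hex_val(src[i + 2]) >= 0) {
            dst[out++] = (unsigned char)(hex_val(src[i + 1]) * 16 +
                                         hex_val(src[i + 2]));
            i += 2;
        } else if (c == '_' && mode == QP_RFC2047) {
            dst[out++] = ' ';
        } else {
            dst[out++] = c;
        }
    }
    return out;
}
```

The underscore-to-blank translation lives in the decoder, so the
validator has no reason to touch the string.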

I'll also take a stab at finding the ?= with something other than
str*(), in other words, nuking all the str*() calls from text_decode
now, but I may need to defer the other occurrences to a later date.
It's 0200 local time...
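
Something along these lines, as a sketch only (find_terminator is a
hypothetical name, not a function in bogofilter): search a
length-bounded range with memchr() so that an embedded NUL cannot end
the search early, and have the caller check for NULL before doing any
pointer arithmetic on the result.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: finds the encoded-word terminator "?=" within
 * [beg, end). Returns a pointer to the '?' or NULL if absent.
 * Works on a byte range, so NUL bytes in the data are harmless. */
const unsigned char *find_terminator(const unsigned char *beg,
                                     const unsigned char *end)
{
    const unsigned char *p = beg;

    assert(beg != NULL && beg <= end);
    while (end - p >= 2) {
        /* leave room for the '=' that must follow the '?' */
        p = memchr(p, '?', (size_t)(end - p) - 1);
        if (p == NULL)
            return NULL;
        if (p[1] == '=')
            return p;
        p++;                    /* lone '?', keep looking */
    }
    return NULL;
}
```

A caller would compute the size only after a successful search, e.g.
"if (txt != NULL) size = (uint) (txt - beg);", and treat NULL as a
malformed token.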

-- 
Matthias Andree

Encrypted mail welcome: my GnuPG key ID is 0x052E7D95 (PGP/MIME preferred)


