segfault on rfc2047-like subject

Matthias Andree matthias.andree at gmx.de
Sat Oct 9 02:00:48 CEST 2004


David Relson <relson at osagesoftware.com> writes:

> I've started looking at text_decode.   So far, I see that the line
> causing the segfault should use the Z.

Wouldn't help. The word argument that text_decode works on is taken
straight from flex, and we must not touch anything outside of what flex
has provided us. We've trashed that boundary. This wasn't what caused
Clint's segfault, but it might cause one when string ends fall on flex
buffer boundaries.

> However variable "len" has a bad value, which is part of the problem.

Consequential fault. The problem is a bit more complex:

1. qp_validate replaced the \n that was embedded in the encoded
   word with a NUL (fixed in CVS)

2. qp_validate happily accepted RFC-2047 encoded words with blanks
   (fixed in CVS)

3. text_decode operates with strstr when it should be using mem* or
   word_*. That fails with the embedded NUL inherited from 1.
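
To illustrate point 3, here is a minimal sketch of a length-bounded
search that does not depend on NUL termination. The name find_bytes and
its signature are made up for this example; this is not the bogofilter
code and not the fix that went into CVS.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not bogofilter code): find the first occurrence of
 * needle (nlen bytes) inside hay (hlen bytes). Both are treated as raw
 * byte ranges, so an embedded NUL does not end the search and nothing
 * outside hay[0 .. hlen-1] is ever read. */
static const char *find_bytes(const char *hay, size_t hlen,
                              const char *needle, size_t nlen)
{
    assert(hay != NULL && needle != NULL);

    if (nlen == 0)
        return hay;

    while (hlen >= nlen) {
        /* Only positions where a full needle still fits are candidates. */
        const char *p = memchr(hay, needle[0], hlen - nlen + 1);
        if (p == NULL)
            return NULL;
        if (memcmp(p, needle, nlen) == 0)
            return p;
        /* Skip past the failed candidate and keep looking. */
        hlen -= (size_t)(p - hay) + 1;
        hay = p + 1;
    }
    return NULL;
}
```

With a buffer such as "=?iso-8859-1?Q?foo\0bar?=" (NUL embedded after
"foo"), strstr(buf, "?=") stops at the NUL and finds nothing, while
find_bytes(buf, len, "?=", 2) still locates the terminator because it
is driven by the explicit length.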

> I'm investigating and will have a fix soon (with luck).

Stop right there :-)

The immediate fix is in CVS. Making qp.[ch] aware of the differences
between RFC-2045 and RFC-2047 and stopping qp_validate() from tampering
with the data resolved this problem: qp_validate will now figure that a
blank doesn't belong in an RFC-2047 encoded word, and so text_decode
will leave such a word alone.
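
For the curious, a rough sketch of what a read-only, mode-aware check
could look like. The names qp_mode, QP_RFC2045, QP_RFC2047 and qp_check
are invented for this example and the rules are simplified (lowercase
hex is tolerated, for instance); the real qp_validate in CVS may well
differ.

```c
#include <assert.h>
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names, not the ones used in qp.[ch]. */
enum qp_mode { QP_RFC2045, QP_RFC2047 };

/* Check len bytes at s without modifying them (note the const).
 * Simplified rules:
 *  - "=XX" with two hex digits is accepted in both modes;
 *  - a soft line break ("=\n" or "=\r\n") is accepted only in
 *    RFC-2045 (body) mode;
 *  - in RFC-2047 (encoded-word) mode, blanks, line ends, NUL and '?'
 *    are rejected; a space would have to be written as '_' or "=20". */
static bool qp_check(const unsigned char *s, size_t len, enum qp_mode mode)
{
    assert(s != NULL || len == 0);

    for (size_t i = 0; i < len; i++) {
        unsigned char c = s[i];

        if (c == '=') {
            if (len - i >= 3 && isxdigit(s[i + 1]) && isxdigit(s[i + 2])) {
                i += 2;             /* skip the two hex digits */
                continue;
            }
            if (mode == QP_RFC2045) {
                if (len - i >= 2 && s[i + 1] == '\n') {
                    i += 1;         /* soft line break, bare LF */
                    continue;
                }
                if (len - i >= 3 && s[i + 1] == '\r' && s[i + 2] == '\n') {
                    i += 2;         /* soft line break, CRLF */
                    continue;
                }
            }
            return false;           /* stray '=' */
        }

        if (mode == QP_RFC2047 &&
            (c == ' ' || c == '\t' || c == '\r' || c == '\n' ||
             c == '\0' || c == '?'))
            return false;
    }
    return true;
}
```

The point of the const parameter is that a validator which only answers
yes or no cannot plant a NUL in flex's buffer, so the caller can simply
skip decoding when the answer is no.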

More text_decode fixes coming up.

Looking at the whole issue a bit more broadly, word.[ch] pretty much
resembles the C++ standard library's "string" class. Using that would
save us a lot of hassle, but I'd suggest that C++ is a post-1.0 issue.

-- 
Matthias Andree

Encrypted mail welcome: my GnuPG key ID is 0x052E7D95 (PGP/MIME preferred)


