unfolding header lines
Boris 'pi' Piwinger
3.14 at logic.univie.ac.at
Thu Sep 4 09:36:44 CEST 2003
David Relson <relson at osagesoftware.com> wrote:
>The initial implementation is done by C function get_unfolded_line()
>which does a bit of pre-reading of text to identify folded lines. The
>code converts the newlines encountered to spaces.
This is wrong. The newline must simply be deleted, since the
whitespace that follows it already belongs to the header. Not
that it matters in our case ;-) Actually, we could collapse
every run of whitespace (blank, tab, newline) into one blank.
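
To make the two options concrete, here is a minimal sketch in C. The
names unfold_header() and collapse_whitespace() are hypothetical and
are not bogofilter's get_unfolded_line(). The first function deletes a
line break only when a blank or tab follows it; the second collapses
each whitespace run into one blank. Treating a carriage return as
whitespace is an assumption added here.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copy src to dst, deleting every line break
 * (LF or CRLF) that is immediately followed by a blank or tab.
 * The whitespace itself is kept. Returns the length written, or
 * (size_t)-1 if dst (capacity dstsize, including NUL) is too small. */
size_t unfold_header(const char *src, char *dst, size_t dstsize)
{
    size_t n = 0;

    assert(src != NULL && dst != NULL && dstsize > 0);
    while (*src != '\0') {
        const char *p = src;

        if (p[0] == '\r' && p[1] == '\n')
            p += 2;
        else if (p[0] == '\n')
            p += 1;
        if (p != src && (*p == ' ' || *p == '\t')) {
            src = p;            /* drop the line break only */
            continue;
        }
        if (n + 1 >= dstsize)
            return (size_t)-1;  /* no room left for char + NUL */
        dst[n++] = *src++;
    }
    dst[n] = '\0';
    return n;
}

/* Hypothetical helper: collapse each run of blank, tab, CR and LF
 * into a single blank, in place. Returns the new length. */
size_t collapse_whitespace(char *s)
{
    size_t r, w = 0;
    int in_ws = 0;

    assert(s != NULL);
    for (r = 0; s[r] != '\0'; r++) {
        if (strchr(" \t\r\n", s[r]) != NULL) {
            if (!in_ws)
                s[w++] = ' ';
            in_ws = 1;
        } else {
            s[w++] = s[r];
            in_ws = 0;
        }
    }
    s[w] = '\0';
    return w;
}
```

Note that collapse_whitespace() also turns the line break that ends
the header into a blank, so it only makes sense on one logical header
line at a time.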
>It all works great -
>until the folded line far exceeds the prescribed max line length
>(RFC-2822, 998 characters).
As someone stated, this is the limit for a line as it is
transmitted, not for the unfolded header. I don't recall if
there is any real limit on the latter.
>When the input buffer gets close to full
>(over 8k), the function returns and the remainder of the folded line
>isn't tagged.
If we limit to 8k, that will probably be good enough. But as
I suggested, if we don't want to do that, we can simply
split those long lines into multiple lines.
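
One way the splitting could look, as a rough sketch only: the
functions chunk_length() and count_chunks() below are hypothetical,
and preferring a break at whitespace (so that tokens stay whole where
possible) is an assumption, not something decided in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: length of the next piece of s, at most max
 * characters. If s is longer than max, break before the last blank
 * or tab within reach so a token is not cut; if there is none,
 * split hard at max. */
size_t chunk_length(const char *s, size_t max)
{
    size_t len = 0, i;

    assert(s != NULL && max > 0);
    while (len <= max && s[len] != '\0')
        len++;
    if (len <= max)
        return len;             /* the rest fits in one piece */
    for (i = max; i > 0; i--)
        if (s[i] == ' ' || s[i] == '\t')
            return i;           /* next piece starts at the whitespace */
    return max;                 /* no whitespace: hard split */
}

/* Walk a long line piece by piece; a real caller would hand each
 * piece to the tokenizer instead of just counting. */
size_t count_chunks(const char *s, size_t max)
{
    size_t n = 0;

    assert(s != NULL && max > 0);
    while (*s != '\0') {
        s += chunk_length(s, max);
        n++;
    }
    return n;
}
```

Each piece after the first begins with the whitespace it was split
at, so it looks like an ordinary continuation line to whatever reads
it next.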
>It has been suggested that the flex grammar, i.e. lexer_v3.l, is the
>right place to handle the unfolding.
I don't think so. Decoding could be done before that as
well, which seems more reasonable: the lexer should just read
the message as you and I do in our readers.
>At the end of every message header and mime body part header
>is an empty line. Using the C code, the pattern for the line is "^[ \t]*$".
Hm, there must not be any whitespace in that line. Was that
found in the wild?
pi