unfolding header lines
michael at optusnet.com.au
Thu Sep 4 05:10:19 CEST 2003
David Relson <relson at osagesoftware.com> writes:
> Greetings all,
>
> As you all know, bogofilter now (as of 0.15.1) knows about unfolding
> header lines. This is useful as it allows tagging of all the tokens of
> a multi-line To:, Subject:, From:, or Return-Path: header line.
> [...] Using the C code, the pattern for the line is "^[ \t]*$".
> When the unfolding work shifts into lexer_v3.l, the pattern
> becomes "\n[ \t]*\n" and this causes trouble. The lexer is in header
> mode as it reads the empty line and as it pre-reads the line _after_
> that. Being in header mode, base64 and qp decoding don't get applied.
> End of story :-(
No, you've got the unfolding regex wrong. It's not '^[ \t]*', it's
'^[ \t]+'. I.e. one or more, not zero or more.
And one or more doesn't match an empty line. (At least, it shouldn't.
I didn't think the RFC allowed whitespace on an empty line?)
So just add
<INITIAL>\n[ \t] ; /* unfold lines */
to the end of the INITIAL section in the lexer. Or am I misunderstanding
the problem? (Note that you don't need either a + or a * after the
[] as the lexer will eat any additional whitespace normally.)
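
To illustrate why requiring at least one whitespace character leaves the
empty separator line alone, here is a minimal standalone sketch in C. It is
not bogofilter code: unfold_headers() is a hypothetical helper written only
for this example. It drops a newline only when the next character is a space
or tab, and it keeps that whitespace in the output, whereas the lexer rule
above discards the matched pair and leaves token separation to the lexer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper, not part of bogofilter: copy `in` to `out`,
 * dropping every newline that is immediately followed by a space or tab.
 * `out` must have room for strlen(in) + 1 bytes.
 * Returns the length of the unfolded text. */
size_t unfold_headers(const char *in, char *out)
{
    size_t n = 0;

    assert(in != NULL && out != NULL);

    for (size_t i = 0; in[i] != '\0'; i++) {
        /* "\n" followed by one whitespace char marks a folded line.
         * A newline followed by another newline (the empty line that
         * ends the headers) does not match and is copied through. */
        if (in[i] == '\n' && (in[i + 1] == ' ' || in[i + 1] == '\t'))
            continue;   /* drop the newline, keep the whitespace */
        out[n++] = in[i];
    }
    out[n] = '\0';
    return n;
}
```

Given "Subject: a\n\tb\n\nbody\n", the folded Subject: line is joined into
"Subject: a\tb", while the "\n\n" that separates headers from body is left
intact, so the switch out of header mode can still happen at the empty line.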
Michael.