[cvs] bogofilter/src lexer_v3.l,1.158,1.159
Matthias Andree
matthias.andree at gmx.de
Mon Jun 27 00:31:47 CEST 2005
David Relson <relson at osagesoftware.com> writes:
> Processing RFC2047 encoded words without yy_unput and recursion is
> tricky.
Only because we're used to the lexer recognizing these words.
> -- another task for the lexer. Handling either of these tasks without
> using lexer_v3.l adds complexity to the parsing.
... and removes the recursion, as the other parser has no unput functionality :)
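To make that concrete, here is a rough sketch of a single-pass decoder that needs neither unput nor recursion. This is not bogofilter code: decode_2047() and its helpers (hexval, b64val, decode_q, decode_b) are hypothetical names made up for illustration. It assumes the whole header line is in memory, ignores the charset, and does not collapse whitespace between adjacent encoded words as RFC 2047 asks.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: value of a hex digit, or -1 if c is not one. */
static int hexval(int c)
{
    if (c >= '0' && c <= '9') return c - '0';
    c = toupper((unsigned char)c);
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

/* Hypothetical helper: value of a base64 digit, or -1 for padding/junk. */
static int b64val(int c)
{
    static const char tbl[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    const char *p = (c != '\0') ? strchr(tbl, c) : NULL;
    return p ? (int)(p - tbl) : -1;
}

/* Decode a "Q" payload [s, e) into out; output is never longer than input. */
static size_t decode_q(const char *s, const char *e, char *out)
{
    size_t n = 0;
    while (s < e) {
        if (*s == '_') {
            out[n++] = ' ';
            s++;
        } else if (*s == '=' && e - s >= 3
                   && hexval((unsigned char)s[1]) >= 0
                   && hexval((unsigned char)s[2]) >= 0) {
            out[n++] = (char)(hexval((unsigned char)s[1]) * 16
                              + hexval((unsigned char)s[2]));
            s += 3;
        } else {
            out[n++] = *s++;
        }
    }
    return n;
}

/* Decode a "B" (base64) payload [s, e) into out, skipping padding. */
static size_t decode_b(const char *s, const char *e, char *out)
{
    size_t n = 0;
    unsigned acc = 0;
    int bits = 0;
    for (; s < e; s++) {
        int v = b64val((unsigned char)*s);
        if (v < 0) continue;
        acc = (acc << 6) | (unsigned)v;
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out[n++] = (char)((acc >> bits) & 0xFF);
        }
    }
    return n;
}

/*
 * Copy src to dst, decoding =?charset?Q|B?payload?= words on the way.
 * Decoded bytes go straight to dst and are never scanned again, so an
 * encoded word hidden inside a payload stays literal.
 * Returns the length written, or 0 if dst is too small.
 */
size_t decode_2047(const char *src, char *dst, size_t dstlen)
{
    size_t n = 0;
    assert(src != NULL && dst != NULL);
    if (dstlen < strlen(src) + 1) {   /* decoding never grows the text */
        if (dstlen > 0) dst[0] = '\0';
        return 0;
    }
    while (*src != '\0') {
        const char *q1, *end;
        if (src[0] == '=' && src[1] == '?'
            && (q1 = strchr(src + 2, '?')) != NULL   /* end of charset */
            && q1[1] != '\0' && q1[2] == '?'         /* one-letter encoding */
            && (end = strstr(q1 + 3, "?=")) != NULL) {
            int enc = toupper((unsigned char)q1[1]);
            if (enc == 'Q' || enc == 'B') {
                n += (enc == 'Q') ? decode_q(q1 + 3, end, dst + n)
                                  : decode_b(q1 + 3, end, dst + n);
                src = end + 2;                        /* resume after "?=" */
                continue;
            }
        }
        dst[n++] = *src++;                            /* ordinary byte */
    }
    dst[n] = '\0';
    return n;
}
```

With this sketch, "=?us-ascii?Q?=3D=3Fus-ascii=3FQ=3Fx=3F=3D?=" comes out as the literal text "=?us-ascii?Q?x?=" rather than "x", because the output is never fed back to the scanner.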
> Permit me to play devil's advocate here. Does it matter if the header
> lines are processed more than once for RFC2047 tokens? Is there any
> measurable effect on bogofilter's classification abilities? If the
> answer is no, perhaps we should document this as an unimportant
> limitation. What think you?
It doesn't matter too much, but reprocessing repairs broken headers that would
otherwise emit tokens (usually from the Subject line) that we could score on.
--
Matthias Andree