[cvs] bogofilter/src lexer_v3.l,1.158,1.159

Matthias Andree matthias.andree at gmx.de
Mon Jun 27 00:31:47 CEST 2005


David Relson <relson at osagesoftware.com> writes:

> Processing RFC2047 encoded words without yy_unput and recursion is
> tricky.

Only because we're used to the lexer recognizing these words.

> -- another task for the lexer.  Handling either of these tasks without
> using lexer_v3.l adds complexity to the parsing.

and removes the recursion, as the other parser has no unput functionality :)
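
To illustrate what decoding without unput and recursion could look like, here
is a minimal sketch in Python rather than flex/C. It is not bogofilter's code:
the names ENCODED_WORD, ADJACENT_GAP, decode_word and decode_header_once are
made up for this example. It assumes the whole unfolded header line is
available as a string, and it substitutes each RFC 2047 encoded word exactly
once. Decoded text is never fed back into the scanner, so nothing is
processed twice.

```python
import base64
import quopri
import re

# RFC 2047 encoded word: =?charset?encoding?encoded-text?=
# (hypothetical names; this is an illustration, not bogofilter's lexer)
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")

# Whitespace that separates two adjacent encoded words is dropped.
ADJACENT_GAP = re.compile(r"(?<=\?=)\s+(?==\?)")


def decode_word(match):
    """Decode one encoded word; return it unchanged if it is broken."""
    charset = match.group(1)
    encoding = match.group(2).upper()
    text = match.group(3)
    try:
        if encoding == "B":
            raw = base64.b64decode(text, validate=True)
        else:
            # header=True makes "_" decode to a space, as Q encoding requires.
            raw = quopri.decodestring(text.encode("ascii"), header=True)
        return raw.decode(charset)
    except (ValueError, LookupError):
        # Bad base64, bad bytes for the charset, or unknown charset.
        return match.group(0)


def decode_header_once(line):
    """Replace every encoded word in a single left-to-right pass.

    re.sub does not rescan the replacement text, so a decoded word that
    itself looks like an encoded word is left alone (no recursion).
    """
    joined = ADJACENT_GAP.sub("", line)
    return ENCODED_WORD.sub(decode_word, joined)


if __name__ == "__main__":
    print(decode_header_once(
        "Subject: =?ISO-8859-1?Q?Gr=FC=DFe?= from =?UTF-8?B?QmVybGlu?="
    ))
```

The single substitution pass is the design choice here: the output of one
decode never becomes scanner input again, which is the property the
unput-based approach does not have.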

> Permit me to play devil's advocate here.  Does it matter if the header
> lines are processed more than once for RFC2047 tokens?  Is there any
> measurable effect on bogofilter's classification abilities?  If the
> answer is no, perhaps we should document this as an unimportant
> limitation.  What think you?

It doesn't matter too much, but it repairs broken headers that would
otherwise emit tokens (usually from the Subject line) we could score on.

-- 
Matthias Andree


