the recent ^From issues.

Mon Jan 27 13:47:36 CET 2003

Matthias Andree <matthias.andree at gmx.de> writes:

> My original idea was to have one lexer (lexer_head.l) to gather the
> structure, and pass decoded stuff down to the "token extracting"
> lexers. Given that "^From " lines will never be encoded, this is
> clean.
>
> Any rules that are aware of the message or MIME structure in
> lexer_text_{plain,html}.l are clearly misplaced under these assumptions.

To refine these thoughts: after looking into token.c, I think the LEXER
state switching is wrong. We need to always run lexer_lex() first and,
if and only if it is in a "body" mode, decode the lines it gathered and
pass them down to the corresponding text_*_lex() functions. This will
let us put all the decoding, structure detection and so on into
lexer_head.l and keep lexer_text_*.l simple and robust.
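
A minimal sketch of that control flow in C, to make the proposal
concrete. All names here (msg_state_t, head_lex_line,
text_plain_lex_sketch) are hypothetical and not taken from token.c or
the existing lexers. The "body lexer" merely counts words, and the
decoding step is only marked by a comment:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical message states tracked by the head lexer only. */
typedef enum { IN_HEADER, IN_BODY } msg_state_t;

/* Hypothetical stand-in for a text_*_lex() function: it sees only
 * already-decoded body text and knows nothing about message or MIME
 * structure. Here it just counts whitespace-separated words. */
static size_t text_plain_lex_sketch(const char *line)
{
    size_t words = 0;
    bool in_word = false;

    for (; *line != '\0'; line++) {
        bool space = (*line == ' ' || *line == '\t' ||
                      *line == '\n' || *line == '\r');
        if (!space && !in_word)
            words++;
        in_word = !space;
    }
    return words;
}

/* Hypothetical head lexer step, always run first on every line.
 * It owns all structure detection. Returns the number of tokens the
 * body lexer produced, or 0 for lines consumed as structure. */
static size_t head_lex_line(msg_state_t *state, const char *line)
{
    assert(state != NULL && line != NULL);

    /* An mbox "^From " separator is never encoded, so it is safe to
     * detect it here, before any decoding. It starts a new message. */
    if (strncmp(line, "From ", 5) == 0) {
        *state = IN_HEADER;
        return 0;
    }

    if (*state == IN_HEADER) {
        /* An empty line ends the header and switches to body mode. */
        if (line[0] == '\n' || line[0] == '\r' || line[0] == '\0')
            *state = IN_BODY;
        return 0;
    }

    /* Body mode: a real implementation would decode the line here
     * (base64, quoted-printable) before handing it down. Omitted. */
    return text_plain_lex_sketch(line);
}
```

Feeding the lines of an mbox file to head_lex_line() one at a time with
a single msg_state_t keeps the "^From " handling in one place, so the
text lexer never needs a rule for it.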

Any objections?

-- 
Matthias Andree