More strangeness in lexer?

michael at optusnet.com.au
Thu Aug 7 02:38:21 CEST 2003


Why does the pattern

^From\ 

call 'is_from' which checks for the exact same match?

And why does lexer.c:lgetsl() call 'is_from'?

The problem here is that the lexer is allowed to
call yy_get_new_line() during look-ahead. So if
the lexer looks ahead, token_init() can wind
up being called before the tokens in question
have been parsed. Bad.

It would be better to have token.c:get_token()
call token_init() when it sees a 'FROM' token
returned, yes?
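
A rough sketch of that idea, not the actual bogofilter code. Only the
names get_token() and token_init() come from the source tree; the token
kinds, toy_lex(), set_stream() and the counters are made up for
illustration. The point is that the reset happens when the FROM token is
actually consumed, so lexer look-ahead cannot trigger it early.

```c
#include <stddef.h>

/* Hypothetical token kinds; bogofilter's real token types differ. */
enum tok { TOK_EOF, TOK_FROM, TOK_WORD };

/* Stand-in for the generated lexer: replays a fixed token stream. */
static const enum tok *stream;
static size_t stream_pos;

static enum tok toy_lex(void)
{
    enum tok t = stream[stream_pos];
    if (t != TOK_EOF)
        stream_pos++;
    return t;
}

/* Hypothetical per-message state that token_init() resets. */
int words_in_message;
int messages_seen;

void token_init(void)
{
    words_in_message = 0;
    messages_seen++;
}

/* Hypothetical helper: point the toy lexer at a new token stream. */
void set_stream(const enum tok *s)
{
    stream = s;
    stream_pos = 0;
    words_in_message = 0;
    messages_seen = 0;
}

/*
 * Reset per-message state only when a FROM token is handed to the
 * caller, never from inside the lexer's line-reading callback.
 */
enum tok get_token(void)
{
    enum tok t = toy_lex();

    if (t == TOK_FROM)
        token_init();
    else if (t == TOK_WORD)
        words_in_message++;
    return t;
}
```

With this arrangement every token of the previous message has been
returned before the state is reset, however far ahead the lexer reads.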


Note that either way, the pattern is broken.
The '^From ' MUST follow a blank line for
it to be a separator.

I.e. "fred\nFrom hi" is NOT a separator, but
"fred\n\nFrom hi" is.

(More properly, emails stored in an mbox
file must end with a blank line, but let's
not split hairs. :)
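
A minimal sketch of the blank-line rule, assuming the caller keeps the
previous line around. The names is_blank_line() and is_mbox_separator()
are hypothetical, not bogofilter's is_from(), and treating the very
first line of the file as a valid separator is also an assumption.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: true if the line is empty or just a line ending. */
bool is_blank_line(const char *line)
{
    return line[0] == '\0'
        || strcmp(line, "\n") == 0
        || strcmp(line, "\r\n") == 0;
}

/*
 * Hypothetical check: a line counts as an mbox separator only if it
 * starts with "From " AND the previous line was blank. prev == NULL
 * means there is no previous line (start of file), which is accepted
 * here by assumption.
 */
bool is_mbox_separator(const char *prev, const char *line)
{
    if (strncmp(line, "From ", 5) != 0)
        return false;
    return prev == NULL || is_blank_line(prev);
}
```

So is_mbox_separator("fred\n", "From hi\n") is false, while
is_mbox_separator("\n", "From hi\n") is true.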

Michael.



