Radical lexers

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Tue Jan 20 15:40:46 CET 2004


Boris 'pi' Piwinger wrote:

> This is a very short test only. I compare my version (a) of
> the lexer (http://piology.org/bogofilter/lexer_v3.l) with a
> much stricter version of it (b). TOKEN will effectively be
> of the form
> [^[:blank:][:cntrl:]<>;&%@|/\\{}^"*,[\]=()+?:#$._!'`~-]+

That was more than a month ago. When 0.16 came out I
put this lexer (see http://piology.org/bogofilter/
for the radical lexer) into production. I have to say that I
am totally satisfied.

So the main difference is that TOKEN is much simpler. As a
result, tokens will on average be shorter, since the radical
lexer splits them where the standard lexer does not
(my-lexer is one token with the standard lexer, but two with
the radical one).
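
To illustrate the splitting, here is a minimal Python sketch of the
TOKEN character class quoted above. It is not the flex source: it
assumes [:blank:] means space and tab and [:cntrl:] means 0x00-0x1f
plus 0x7f, and it ignores every other rule in the lexer. The names
SEPARATORS, RADICAL_TOKEN and radical_tokens are made up for this
example.

```python
import re

# Characters that end a token, copied from the TOKEN class quoted above.
SEPARATORS = "<>;&%@|/\\{}^\"*,[]=()+?:#$._!'`~-"

# Assumption: [:blank:] is space and tab, [:cntrl:] is 0x00-0x1f plus 0x7f
# (the ASCII meaning). The real lexer is written for flex, not Python.
RADICAL_TOKEN = re.compile(
    "[^ \\t\\x00-\\x1f\\x7f" + re.escape(SEPARATORS) + "]+"
)


def radical_tokens(text):
    """Return the maximal runs of characters allowed by the TOKEN class."""
    return RADICAL_TOKEN.findall(text)


if __name__ == "__main__":
    for sample in ("my-lexer", "user@example.com"):
        print(sample, "->", radical_tokens(sample))
```

With this class, my-lexer comes out as the two tokens my and lexer,
because the hyphen is one of the excluded characters.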

Another side effect is that some rules become simpler thanks
to the shorter TOKEN definition, but that should not change
the parsing.

Also, some special rules are simply dropped (the $-rule, the
DOCTYPE switch).

pi



