lexer change
Boris 'pi' Piwinger
3.14 at logic.univie.ac.at
Tue Nov 4 15:44:29 CET 2003
David Relson wrote:
> 1 - remove unnecessary backslashes and reorder TOKEN... patterns.
>
> As they make _no_ difference to the generated code, most of these
> changes have been applied. I did not remove backslashes followed by
> quotes or square brackets as they affect code colorizing in emacs.
I guess I can live with that:-)) Also with the ordering of
items.
> 2 - acceptance of digits at the beginning of tokens and acceptance of
> numbers as tokens
>
> Rejected. I don't see value in this change.
>
> 3 - acceptance of two character tokens.
>
> Rejected pending further evaluation.
That was the idea. People should test it. Attached is a patch
for those who want to play with it; apply it *after* David's patch.
> 4 - Removal of the {1,70} repetition count in the TOKEN pattern.
>
> Accepted. This is the biggie!
>
> With this change the generated lexer_v3.c file shrinks from 1.8M to
> 1.2M and a stripped bogofilter executable shrinks from 1.8M to 1.4M.
Great, I did not expect my changes would be that useful:-))
pi
-------------- next part --------------
--- lexer_v3.l.bak Tue Nov 4 14:40:35 2003
+++ lexer_v3.l Tue Nov 4 14:42:12 2003
@@ -133,8 +133,8 @@
NUM_NUM \ [0-9]+\ [0-9]+
MSG_COUNT ^\"\.MSG_COUNT\"
-TOKENFRONT [^[:blank:][:cntrl:][:digit:][:punct:]]
-TOKENMID [^[:blank:]<>;=():&%$#@+|/\\{}^\"?*,[:cntrl:][\]]+
+TOKENFRONT [^[:blank:][:cntrl:][:punct:]]
+TOKENMID [^[:blank:]<>;=():&%$#@+|/\\{}^\"?*,[:cntrl:][\]]*
BOGOLEX_TOKEN [^[:blank:]<>; &% @ |/\\{}^\" *,[:cntrl:][\]]+
TOKENBACK [^[:blank:]<>;=():&%$#@+|/\\{}^\"?*,[:cntrl:][\]._+-]
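
For anyone who wants a feel for the effect before rebuilding the lexer,
here is a rough sketch in Python, not the flex code itself. It assumes
that a token is composed as TOKENFRONT TOKENMID TOKENBACK (that rule is
not part of the hunk above), approximates the POSIX classes with ASCII
only, and ignores every other lexer rule, so real bogofilter output may
differ. The names OLD_TOKEN, NEW_TOKEN, negated_class and token_regex
are made up for this illustration.

```python
import re
import string

# ASCII-only stand-ins for the POSIX classes used in lexer_v3.l (an approximation).
BLANK = " \t"
CNTRL = "".join(chr(c) for c in range(32)) + "\x7f"
DIGIT = string.digits
PUNCT = string.punctuation

# Characters excluded by TOKENMID and TOKENBACK, copied from the patch context.
MID_EXCLUDED = BLANK + CNTRL + '<>;=():&%$#@+|/\\{}^"?*,[]'
BACK_EXCLUDED = MID_EXCLUDED + "._-"


def negated_class(chars):
    """Build a regex class matching any single character not in chars."""
    return "[^" + "".join("\\x%02x" % ord(c) for c in sorted(set(chars))) + "]"


def token_regex(front_excluded, mid_quantifier):
    """Assumed composition: TOKEN = TOKENFRONT TOKENMID TOKENBACK."""
    return re.compile(
        negated_class(front_excluded)
        + negated_class(MID_EXCLUDED) + mid_quantifier
        + negated_class(BACK_EXCLUDED)
    )


# Before the patch: no leading digit, TOKENMID needs at least one character.
OLD_TOKEN = token_regex(BLANK + CNTRL + DIGIT + PUNCT, "+")
# After the patch: leading digits allowed, TOKENMID may be empty.
NEW_TOKEN = token_regex(BLANK + CNTRL + PUNCT, "*")

sample = "ab 42 2003 x1 hello"
print(OLD_TOKEN.findall(sample))  # ['hello']
print(NEW_TOKEN.findall(sample))  # ['ab', '42', '2003', 'x1', 'hello']
```

Under these assumptions the shortest token drops from three characters
to two, and pure numbers such as 2003 become tokens.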