radical lexer

David Relson relson at osagesoftware.com
Sun Nov 26 20:26:22 CET 2006


On Sun, 26 Nov 2006 20:07:57 +0100
Boris 'pi' Piwinger wrote:

> David Relson <relson at osagesoftware.com> wrote:
> 
> >A quick comparison of bogofilter's lexer_v3.l and your radical lexer
> >was interesting, particularly the following line:
> >
> >TOKENBORDER [^[:blank:][:cntrl:]<>;&%@|/\\{}^"*,[\]=()+?:#$._!'`~-]
> 
> Also we differ on BOGOLEX_TOKEN, where I don't allow =()+
> which you do allow. I really don't know if this is
> important. Can you explain why?

Good question!  BOGOLEX_TOKEN should allow the characters permitted in
a TOKEN _plus_ colon and asterisk (which are used for header tokens and
multi-tokens).
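
For anyone who wants to check which characters the BOGOLEX_TOKEN class from the patch below lets through, here is a rough Python sketch. It is not part of bogofilter: the names BLANK, CNTRL, EXCLUDED and is_bogolex_token are made up for illustration, and [:blank:] and [:cntrl:] are approximated for ASCII only, so treat it as an approximation of the flex class rather than an exact equivalent.

```python
import re

# Characters excluded by the patched BOGOLEX_TOKEN class, copied from the
# diff. Assumption: [:blank:] and [:cntrl:] are approximated for ASCII only.
BLANK = " \t"
CNTRL = "".join(chr(c) for c in range(0x20)) + "\x7f"
EXCLUDED = BLANK + CNTRL + '<>;&%@|/\\{}^",[]'

# re.escape keeps every excluded character literal inside the bracket class.
BOGOLEX_TOKEN = re.compile("[^" + re.escape(EXCLUDED) + "]*")


def is_bogolex_token(text):
    """Return True if the whole string consists of permitted characters."""
    return BOGOLEX_TOKEN.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ["subj:free*offer", "a=b(c)+d", "a<b", "user@host"]:
        print(repr(sample), is_bogolex_token(sample))
```

Colon and asterisk pass, as do =()+, while characters such as < and @ are still rejected. Because of the trailing * quantifier the pattern also matches the empty string, so the sketch is only meaningful for non-empty input.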

Another patch attached...
-------------- next part --------------
Index: lexer_v3.l
===================================================================
RCS file: /cvsroot/bogofilter/bogofilter/src/lexer_v3.l,v
retrieving revision 1.172
diff -u -r1.172 lexer_v3.l
--- lexer_v3.l	26 Nov 2006 17:56:29 -0000	1.172
+++ lexer_v3.l	26 Nov 2006 19:23:19 -0000
@@ -147,7 +147,7 @@
 
 TOKENFRONT	[^[:blank:][:cntrl:][:digit:][:punct:]]
 TOKENMID	[^[:blank:][:cntrl:]<>;=():&%$#@+|/\\{}^\"?*,\[\]]*
-BOGOLEX_TOKEN	[^[:blank:][:cntrl:]<>;    &%  @ |/\\{}^\" *,\[\]]+
+BOGOLEX_TOKEN	[^[:blank:][:cntrl:]<>;    &%  @ |/\\{}^\"  ,\[\]]*
 TOKENBACK	[^[:blank:][:cntrl:]<>;=():&%$#@+|/\\{}^\"?*,\[\]._~\'\`\-]
 
 TOKEN		{TOKENFRONT}({TOKENMID}{TOKENBACK})?

