Much simplified lexer
Boris 'pi' Piwinger
3.14 at logic.univie.ac.at
Thu Nov 13 15:47:37 CET 2003
David Relson wrote:
>> I don't get it. It is really surprising to see this explode,
>> since I only removed rules or simplified them, and some
>> character classes slightly changed their size. If I take the
>> last CVS version David sent over the list and my version, I
>> get this:
>>
>>    text    data     bss     dec     hex filename
>>   42597      32   65632  108261   1a6e5 lexer_v3.cvs.o
>>   50233      32   65632  115897   1c4b9 lexer_v3.new.o
> I've attached my copy of lexer_v3.l. Since yesterday I've moved unused
> definitions into comments and made HTMLTOKEN a primary definition
> (rather than a reference to HTML_WI_COMMENT).
Right, looks great. The unified use of character classes and
the cleanup of a lot of those \ are also a good idea. But none
of those changes makes a difference to the size.
Here is the change which does make the difference. You
changed this line:

  <INITIAL>(ESMTP|SMTP)+/[ \t\n]+id\ {ID}

to this line:

  <INITIAL>(ESMTP|SMTP)+

I don't understand what this change is good for. In the
original expression the / seems to be wrong, and maybe the
space after "id" should also be any kind of whitespace. But
why remove it completely?
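
For reference, in flex a pattern of the form r/s uses trailing context: r is matched only when followed by s, and s is not part of the token. Flex rules cannot be run on their own, so the sketch below approximates the two rules in Python, with a lookahead standing in for the "/". The ID pattern is a hypothetical stand-in, since the lexer's {ID} definition is not shown in this thread, and the <INITIAL> start condition is left out.

```python
import re

# Hypothetical stand-in for the lexer's {ID} definition (not shown in this thread).
ID = r"[A-Za-z0-9]+"

# Original rule: the token is (ESMTP|SMTP)+, but only when followed by
# whitespace, "id ", and an ID. The lookahead (?=...) approximates flex's
# trailing context operator "/": it must match but is not consumed.
with_context = re.compile(r"(?:ESMTP|SMTP)+(?=[ \t\n]+id " + ID + ")")

# Changed rule: no trailing context, so every occurrence is a token.
without_context = re.compile(r"(?:ESMTP|SMTP)+")


def tokens(pattern, text):
    """Return the text of every non-overlapping match of pattern in text."""
    return [m.group(0) for m in pattern.finditer(text)]


received = "with ESMTP id PAA16337"   # form quoted in this thread
other = "speaks ESMTP fine"           # made-up text without an "id" part

print(tokens(with_context, received))
print(tokens(with_context, other))
print(tokens(without_context, other))
```

Under these assumptions the first rule yields a token only in the "with ESMTP id ..." form, while the second also fires on ESMTP or SMTP appearing anywhere else in the INITIAL state.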

Anyhow, wouldn't the following be nicer:

  <INITIAL>(E?SMTP)+

And why the +? I only see it in the form "with ESMTP id
PAA16337" etc., with no repeated SMTP or ESMTP. So I would
have expected this version:

  <INITIAL>E?SMTP{WHITESPACE}+{WHITESPACE}id{ID}

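
A quick way to check that (E?SMTP)+ accepts the same strings as (ESMTP|SMTP)+, and to see what the + adds, is to compare the patterns on a few samples. This is only a rough sketch in Python rather than flex, and the sample strings are made up for illustration.

```python
import re

alternation = re.compile(r"(?:ESMTP|SMTP)+")   # rule as currently written
optional_e = re.compile(r"(?:E?SMTP)+")        # proposed shorter spelling
single = re.compile(r"E?SMTP")                 # same, without the +

# Made-up samples: plain, repeated, truncated, and prefixed forms.
samples = ["SMTP", "ESMTP", "ESMTPSMTP", "SMTPESMTP", "ESMT", "XSMTP"]


def full(pattern, s):
    """True if pattern matches the whole string s."""
    return pattern.fullmatch(s) is not None


# Do the two spellings agree on every sample?
same = all(full(alternation, s) == full(optional_e, s) for s in samples)
print(same)

# Which samples need the + to be accepted as a single token?
needs_plus = [s for s in samples if full(optional_e, s) and not full(single, s)]
print(needs_plus)
```

On these samples the two spellings agree, and the + only matters for run-together forms such as "ESMTPSMTP", which do not occur in the Received lines mentioned above.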
pi