lexer change

David Relson relson at osagesoftware.com
Tue Nov 4 15:25:36 CET 2003


Greetings,

My thanks to Boris "pi" Piwinger for the work he's put in on
bogofilter's lexer code.  He tested several sets of changes and some
have been applied to bogofilter.  Here are the details:

1 - remove unnecessary backslashes and reorder TOKEN... patterns.

    As they make _no_ difference to the generated code, most of these
changes have been applied.  I did not remove backslashes followed by
quotes or square brackets, as those backslashes affect code colorizing
in emacs.

2 - acceptance of digits at the beginning of tokens and acceptance of
numbers as tokens

    Rejected.  I don't see value in this change.

3 - acceptance of two character tokens.

    Rejected pending further evaluation.

4 - Removal of the {1,70} repetition count in the TOKEN pattern.

    Accepted.  This is the biggie!

    With this change the generated lexer_v3.c file shrinks from 1.8M to
1.2M and a stripped bogofilter executable shrinks from 1.8M to 1.4M.

    AFAICT, this change doesn't alter parsing results.  Time will tell
if there is an effect that hasn't yet been detected.
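
    As a rough illustration of where a difference could show up, here
is a small sketch using Python's re module rather than flex.  The two
patterns are hypothetical stand-ins for the TOKEN pattern (the real one
lives in lexer_v3.l and is not reproduced in this message), and Python's
matching rules are only an approximation of what the generated lexer
does.

```python
import re

# Hypothetical stand-ins for the TOKEN pattern; the real flex pattern
# is in lexer_v3.l and is not shown in this message.
BOUNDED = re.compile(r"[a-z]{1,70}")   # with the {1,70} repetition count
UNBOUNDED = re.compile(r"[a-z]+")      # with the count removed

def token_lengths(pattern, text):
    """Return the length of each token the pattern finds in text."""
    return [len(tok) for tok in pattern.findall(text)]

short_run = "x" * 70    # exactly at the old limit
long_run = "x" * 100    # beyond the old limit

# A run of 70 characters or fewer is tokenized identically.
print(token_lengths(BOUNDED, short_run), token_lengths(UNBOUNDED, short_run))

# A longer run is split by the bounded pattern but kept whole otherwise.
print(token_lengths(BOUNDED, long_run), token_lengths(UNBOUNDED, long_run))
```

    In this sketch the two patterns agree on every run of 70 characters
or fewer and differ only on longer runs, so any undetected effect would
most likely involve very long tokens.  Whether bogofilter handles such
tokens elsewhere is not covered here.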

To summarize, the changes that help human readability and that reduce
program size have been accepted.  Changes that affect bogofilter results
have not been accepted.

I've attached a patch containing the applied changes.

David
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: patch.lexer_v3.l.1104.txt
URL: <http://www.bogofilter.org/pipermail/bogofilter/attachments/20031104/b0d97fa7/attachment.txt>

