Much simplified lexer

David Relson relson at osagesoftware.com
Wed Nov 12 17:25:10 CET 2003


On Wed, 12 Nov 2003 15:50:25 +0100
Boris 'pi' Piwinger <3.14 at logic.univie.ac.at> wrote:

> David Relson wrote:
> 
> >> > Inspired by our discussion, Tom, I changed the lexer to be
> >> > more in the fashion you describe. If you want to see if it
> >> > works for you, it is attached.
> >> 
> >> How does "size lexer_v3.o" change?
> > 
> > [relson at osage src]$ ll lexer_v3.l lexer_v3.pi.1112.l
> > -rw-r--r--    1 relson   relson      11861 Nov 12 08:11 lexer_v3.l
> > -rw-rw-r--    1 relson   relson      11627 Nov 12 08:12 lexer_v3.pi.1112.l
> > 
> > [relson at osage src]$ size lexer_v3.o lexer_v3.pi.1112.o
> >    text	   data	    bss	    dec	    hex	filename
> >   41899	      8	     60	  41967	   a3ef	lexer_v3.o
> >   51610	      8	  65640	 117258	  1ca0a	lexer_v3.pi.1112.o
> > 
> > While the source file is slightly smaller (approx 230 bytes), the .o
> > file is much larger (almost 3x).
> 
> I don't get it. It is really surprising to see this explode,
> since I removed or simplified rules; only some character
> classes changed slightly in size. If I take the last CVS
> version David sent to the list and my version, I get this:
> 
>    text    data     bss     dec     hex filename
>   42597      32   65632  108261   1a6e5 lexer_v3.cvs.o
>   50233      32   65632  115897   1c4b9 lexer_v3.new.o
> 
> pi

pi,

You haven't shown the size of lexer_v3.l.  I can't explain your
lexer_v3.cvs.o size difference (unless you're using a modified copy of
lexer_v3.l rather than the cvs copy).

I've attached my copy of lexer_v3.l.  Since yesterday I've moved unused
definitions into comments and made HTMLTOKEN a primary definition
(rather than a reference to HTML_WI_COMMENT).

Below are the flex version and my sizes for lexer_v3.l,
lexer_v3.pi.1112.l, and the associated .c and .o files:

[relson at osage src]$ flex --version
flex version 2.5.4

[relson at osage src]$ ll lexer_v3*.l
-rw-r--r--    1 relson   relson      11861 Nov 12 08:11 lexer_v3.l
-rw-rw-r--    1 relson   relson      11627 Nov 12 08:12 lexer_v3.pi.1112.l

[relson at osage src]$ ll lexer_v3*.c
-rw-r--r--    1 relson   relson     101336 Nov 12 08:28 lexer_v3.c
-rw-r--r--    1 relson   relson     118227 Nov 12 11:19 lexer_v3.pi.1112.c

[relson at osage src]$ ll lexer_v3*.o
-rw-r--r--    1 relson   relson      83704 Nov 12 08:28 lexer_v3.o
-rw-r--r--    1 relson   relson      93888 Nov 12 11:20 lexer_v3.pi.1112.o

[relson at osage src]$ size lexer_v3*.o
   text	   data	    bss	    dec	    hex	filename
  40773	      8	     60	  40841	   9f89	lexer_v3.o
  50541	      8	  65640	 116189	  1c5dd	lexer_v3.pi.1112.o
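
To see where the growth comes from, the "size" figures above can be
broken down by segment.  What follows is only a minimal sketch in Python
(not part of bogofilter); the function names parse_size and
segment_deltas are made up for illustration, and the numbers are copied
verbatim from the output above.

```python
# Figures copied from the `size lexer_v3*.o` output above.
SIZE_OUTPUT = """\
   text	   data	    bss	    dec	    hex	filename
  40773	      8	     60	  40841	   9f89	lexer_v3.o
  50541	      8	  65640	 116189	  1c5dd	lexer_v3.pi.1112.o
"""


def parse_size(output):
    """Parse Berkeley-style `size` output into {filename: {segment: bytes}}."""
    rows = {}
    # Skip the header line; every other line has six whitespace-separated fields.
    for line in output.strip().splitlines()[1:]:
        text, data, bss, dec, _hex, name = line.split()
        rows[name] = {
            "text": int(text),
            "data": int(data),
            "bss": int(bss),
            "dec": int(dec),
        }
    return rows


def segment_deltas(rows, old, new):
    """Return the per-segment size change (new minus old) in bytes."""
    return {seg: rows[new][seg] - rows[old][seg]
            for seg in ("text", "data", "bss", "dec")}


rows = parse_size(SIZE_OUTPUT)
deltas = segment_deltas(rows, "lexer_v3.o", "lexer_v3.pi.1112.o")
for seg, delta in deltas.items():
    print(f"{seg:>4}: {delta:+d} bytes")
```

With these figures, bss accounts for 65580 of the 75348 additional
bytes and text for the remaining 9768, so most of the growth is in
uninitialized data rather than in the generated tables or code.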

-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: lexer.prob.0825.txt
URL: <http://www.bogofilter.org/pipermail/bogofilter/attachments/20031112/ca797a2e/attachment.txt>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: lexer_v3.l
Type: application/octet-stream
Size: 11861 bytes
Desc: not available
URL: <http://www.bogofilter.org/pipermail/bogofilter/attachments/20031112/ca797a2e/attachment.obj>

