[patch] small lexer changes

Matthias Andree matthias.andree at gmx.de
Thu Oct 10 02:01:02 CEST 2002


-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Wed, 09 Oct 2002, David Relson wrote:

> >I benchmarked this, I copied my linux-kernel mailing list mbox file, and
> >ran lexertest <lexertest.data >/dev/null. I ran five tests and averaged
> >the results. (Duron 700 with 320 MB RAM).
> 
> The thought occurs to me that lexertest uses printf() to output all the 
> tokens.  For our purposes, a more interesting comparison might be to 
> comment out the printf() calls.

You're right, but it turns out to only change the relative slowdown, not
the absolute one ;-)

See the current lexertest.c; it has a -q option. It turned out that
printf took some time, but not so much as to ruin the data.

%option     user time       text size   time   size   samples
full:       6.04 ± 0.06 s     1481727    +0%    -0%   5
full ecs:   6.33 ± 0.10 s      340843    +5%   -77%   5
fast:       6.79 ± 0.06 s     2836415   +12%   +91%   3
ecs:        7.41 ± 0.08 s      105247   +23%   -93%   5

(I list only the options I played with; the others, such as align, are
unchanged. I don't know how precise "bash time" data is; its resolution
is 1/100 s at best.)
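
For anyone who wants to re-check the relative columns, here is a minimal
sketch in Python that recomputes them from the raw figures in the table
above, taking the "full" row as the baseline. The names `results` and
`relative_percent` are made up for this illustration; they are not part
of lexertest or flex.

```python
# Raw figures copied from the table above: (user time in s, text size in bytes).
results = {
    "full":     (6.04, 1481727),
    "full ecs": (6.33, 340843),
    "fast":     (6.79, 2836415),
    "ecs":      (7.41, 105247),
}


def relative_percent(value, baseline):
    """Change of value relative to baseline, rounded to a whole percent."""
    return round((value / baseline - 1.0) * 100.0)


# The "full" row is the reference point for both relative columns.
base_time, base_size = results["full"]

changes = {
    name: (relative_percent(t, base_time), relative_percent(size, base_size))
    for name, (t, size) in results.items()
}

for name, (time_pct, size_pct) in changes.items():
    print(f"{name + ':':<10} time {time_pct:+d}%  size {size_pct:+d}%")
```

Rounding to whole percent matches the precision used in the table; given
the measurement error, finer resolution would not mean much anyway.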

So I'd suggest "full ecs" or "ecs". "fast" is unacceptable because it
wastes space AND time. That is probably due to cache effects, but I'm too
lazy to run cachegrind now, and I'm not about to tune flex(1).

- -- 
Matthias Andree
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.0.7 (GNU/Linux)

iD8DBQE9pMM+FmbjPHp/pcMRAurWAJ0ZIj8KPHHw2epua9s7aybW+LFMwgCfb1V7
0miUE8FqLBS+nILaQdVQcw0=
=5tpb
-----END PGP SIGNATURE-----



More information about the bogofilter-dev mailing list