flex speed

michael at optusnet.com.au
Tue Aug 5 01:21:55 CEST 2003


Bob Friesenhahn <bfriesen at simple.dallas.tx.us> writes:

> This naturally brings up the subject: Has anyone profiled bogofilter
> using a profiling tool like 'gcc -pg' and 'gprof'?

Yes. You'll be unsurprised to know that the lexer accounts
for more than 90% of the user CPU time. :)

Seriously, all the work in bogofilter is in the tokenizing
driven by the very complex token rules. If you want it to
run faster (and I do!) then the only way out is to hand-build
the lexer.

The problem here is that hand-built lexers, while very fast,
are normally very hard to change.
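
To make "hand-built" concrete, here is a minimal sketch in C of a
hand-written scanner loop. It is only an illustration: the token rule
(runs of letters, digits, '$', '-' and '\'') and the names
is_token_char and next_token are made up for this example and are far
simpler than bogofilter's real rules. Note how the rule is baked into
the code, which is why changing it means rewriting the scanner rather
than editing a pattern.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token rule, NOT bogofilter's actual one: a token is a
 * run of letters, digits, '$', '-' or '\''. */
static int is_token_char(unsigned char c)
{
    return isalnum(c) || c == '$' || c == '-' || c == '\'';
}

/* Find the next token in buf[*pos .. len).
 * On success: store the token's start and length, advance *pos past
 * the token, and return 1. At end of input: return 0. */
int next_token(const char *buf, size_t len, size_t *pos,
               const char **start, size_t *tok_len)
{
    size_t i = *pos;
    size_t begin;

    /* Skip anything that cannot be part of a token. */
    while (i < len && !is_token_char((unsigned char)buf[i]))
        i++;
    if (i == len) {
        *pos = i;
        return 0;
    }

    /* Consume the token itself. */
    begin = i;
    while (i < len && is_token_char((unsigned char)buf[i]))
        i++;

    *start = buf + begin;
    *tok_len = i - begin;
    *pos = i;
    return 1;
}
```

A caller would start with pos set to 0 and call next_token in a loop
until it returns 0. Each token is reported as a pointer and length
into the original buffer, so nothing is copied.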

So I gave up and accepted that bogofilter would continue to be
'slow'. ('slow' being a relative term. It's not good for much more
than about 100 emails per second. Of course, compared to SpamAssassin
at 3 seconds per email, it's blindingly fast :)

Michael.



