flex speed

Bob Friesenhahn bfriesen at simple.dallas.tx.us
Tue Aug 5 01:39:43 CEST 2003


On 5 Aug 2003 michael at optusnet.com.au wrote:

> Bob Friesenhahn <bfriesen at simple.dallas.tx.us> writes:
>
> > This naturally brings up the subject: Has anyone profiled bogofilter
> > using a profiling tool like 'gcc -pg' and 'gprof'?
>
> Yes. You'll be unsurprised to know that the lexer accounts
> for more than 90% of the user CPU time. :)
>
> Seriously, all the work in bogofilter is in the tokenizing
> driven by the very complex token rules. If you want it to
> run faster (and I do!) then the only way out is to hand-build
> the lexer.

Perhaps once the software is mature, creating a hand-built lexer would
be a worthwhile task.

It may be that the lexer could be re-ordered/expanded to improve
performance by short-circuiting checks that don't need to be made.
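
To make the short-circuiting idea concrete, here is a rough sketch in C.
It is not bogofilter's lexer and does not reflect its real token rules;
the token kinds, match_ipaddr() and next_token() are all hypothetical
names invented for illustration. The point is only that an expensive
rule (here, matching a dotted-quad IP address) can be guarded by a cheap
first-character test, so that most tokens never reach it.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token classes; a real spam-filter lexer has many more. */
enum tok_kind { TOK_END, TOK_WORD, TOK_IPADDR };

struct token {
    enum tok_kind kind;
    const char *start;   /* first character of the token */
    size_t len;          /* token length in bytes */
};

/* The "expensive" rule: match a dotted quad such as 10.0.0.1 at p.
 * Returns the matched length, or 0 if p does not start an address. */
static size_t match_ipaddr(const char *p)
{
    const char *s = p;
    for (int part = 0; part < 4; part++) {
        int digits = 0, value = 0;
        while (digits < 3 && isdigit((unsigned char)*s)) {
            value = value * 10 + (*s - '0');
            s++;
            digits++;
        }
        if (digits == 0 || value > 255)
            return 0;
        if (part < 3) {
            if (*s != '.')
                return 0;
            s++;
        }
    }
    /* Reject matches that run straight into more word characters. */
    if (isalnum((unsigned char)*s))
        return 0;
    return (size_t)(s - p);
}

/* Return the next token and advance *cursor past it. */
static struct token next_token(const char **cursor)
{
    const char *p = *cursor;
    struct token t = { TOK_END, NULL, 0 };

    /* Skip separators (anything that is not a letter or digit). */
    while (*p != '\0' && !isalnum((unsigned char)*p))
        p++;
    if (*p == '\0') {
        *cursor = p;
        return t;
    }
    t.start = p;

    /* Short-circuit: only a digit can start an address, so tokens that
     * begin with a letter skip the expensive rule entirely. */
    if (isdigit((unsigned char)*p)) {
        size_t n = match_ipaddr(p);
        if (n > 0) {
            t.kind = TOK_IPADDR;
            t.len = n;
            *cursor = p + n;
            return t;
        }
    }

    /* Fallback rule: a plain run of letters and digits. */
    while (isalnum((unsigned char)*p))
        p++;
    t.kind = TOK_WORD;
    t.len = (size_t)(p - t.start);
    *cursor = p;
    return t;
}
```

Calling next_token() repeatedly on "buy 10.0.0.1 now" should yield a
word, an address, and another word before TOK_END. Whether a similar
reordering of the flex rules would actually help is something only a
profile could confirm.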

> So I gave up and accepted that bogofilter would continue to be
> 'slow'. ('slow' being a relative term. It's not good for much more
> than about 100 emails per second. Of course, compared to spamassassin
> at 3 seconds per email, it's blindingly fast :)

If SpamAssassin is that slow, then bogofilter is clearly the right
choice for me.

Bob
======================================
Bob Friesenhahn
bfriesen at simple.dallas.tx.us
http://www.simplesystems.org/users/bfriesen