bogofilter hogging cpu
David Relson
relson at osagesoftware.com
Fri Nov 12 05:11:00 CET 2004
On Fri, 12 Nov 2004 02:53:48 +0100
Matthias Andree wrote:
> Matthias Andree <matthias.andree at gmx.de> writes:
...[snip]....
> OK,
>
> I used oprofile to gather this data and there is little surprise -
> here's the top ten of time consumers on my Athlon XP 2500+, for
> scoring one Maildir and one mailbox (bogofilter -TM) with my
> production database:
>
>   %    cumulative    self             self     total
>  time     samples   samples  calls  T1/call  T1/call  name
> 36.44    10891.00  10891.00                           yylex
> 10.98    14174.00   3283.00                           msg_compute_spamicity
>  6.65    16161.00   1987.00                           yyinput
>  5.25    17729.00   1568.00                           get_token
>  5.09    19249.00   1520.00                           base64_decode
>  5.05    20759.00   1510.00                           word_cmp
>  4.67    22154.00   1395.00                           xfgetsl
>  4.16    23396.00   1242.00                           wordhash_standard_insert
>  3.93    24572.00   1176.00                           ds_read
>  2.28    25253.00    681.00                           db_get_dbvalue
>
> So the lexer is a very important part and takes a lot of time.
>
Matthias,

I think you're overlooking some important details. The problem reported
was that "top" is showing bogofilter using 40% of the CPU. Bogofilter
is fast and, during mail delivery, shouldn't be running long enough for
top to notice. What's been reported doesn't seem like normal bogofilter
behavior. I'd very much like to know _why_ top even sees bogofilter.

I'm not worried about time spent parsing, scoring, etc. _Something_
needs to be the slowest part and we've long known it's yylex.

Question on oprofile: does it show shared library usage? It might be
interesting to profile bogofilter-static.

Ciao,

David