bogofilter hogging cpu

David Relson relson at osagesoftware.com
Fri Nov 12 05:11:00 CET 2004


On Fri, 12 Nov 2004 02:53:48 +0100
Matthias Andree wrote:

> Matthias Andree <matthias.andree at gmx.de> writes:

...[snip]....

> OK,
> 
> I used oprofile to gather this data and there is little surprise -
> here's the top ten of time consumers on my Athlon XP 2500+, for
> scoring one Maildir and one mailbox (bogofilter -TM) with my
> production database:
> 
>   %   cumulative   self              self     total           
>  time   samples   samples    calls  T1/call  T1/call  name    
>  36.44  10891.00 10891.00                             yylex
>  10.98  14174.00  3283.00                             msg_compute_spamicity
>   6.65  16161.00  1987.00                             yyinput
>   5.25  17729.00  1568.00                             get_token
>   5.09  19249.00  1520.00                             base64_decode
>   5.05  20759.00  1510.00                             word_cmp
>   4.67  22154.00  1395.00                             xfgetsl
>   4.16  23396.00  1242.00                             wordhash_standard_insert
>   3.93  24572.00  1176.00                             ds_read
>   2.28  25253.00   681.00                             db_get_dbvalue
> 
> So the lexer is a very important part and takes a lot of time.
> 
Matthias,

I think you're overlooking some important details.  The problem reported
was that "top" is showing bogofilter using 40% of the CPU.  Bogofilter
is fast and, during mail delivery, shouldn't be running long enough for
top to notice.  What's been reported doesn't seem like normal bogofilter
behavior.  I'd very much like to know _why_ top even sees bogofilter.

I'm not worried about time spent parsing, scoring, etc.  _Something_
needs to be the slowest part and we've long known it's yylex.

Question on oprofile:  does it show shared library usage?  It might be
interesting to profile bogofilter-static.

Ciao,

David


