bogofilter hogging cpu
Matthias Andree
matthias.andree at gmx.de
Fri Nov 12 02:53:48 CET 2004
Matthias Andree <matthias.andree at gmx.de> writes:
> From my communications background, I'd think the analysis part is the
> expensive one, and that would be the lexer and similar components, but
> nothing short of a profile can provide that. That means compiling with
> -pg, running a while with your test load and then checking with gprof
> where most time is spent.
OK,
I used oprofile to gather this data, and there is little surprise.
Here are the top ten time consumers on my Athlon XP 2500+ when scoring
one Maildir and one mailbox (bogofilter -TM) with my production database:
  %   cumulative   self              self     total
 time   samples   samples    calls  T1/call  T1/call  name
 36.44  10891.00 10891.00                             yylex
 10.98  14174.00  3283.00                             msg_compute_spamicity
  6.65  16161.00  1987.00                             yyinput
  5.25  17729.00  1568.00                             get_token
  5.09  19249.00  1520.00                             base64_decode
  5.05  20759.00  1510.00                             word_cmp
  4.67  22154.00  1395.00                             xfgetsl
  4.16  23396.00  1242.00                             wordhash_standard_insert
  3.93  24572.00  1176.00                             ds_read
  2.28  25253.00   681.00                             db_get_dbvalue
So the lexer is the dominant cost: yylex alone accounts for 36.44% of
the samples, more than three times the next entry.
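
For anyone who wants to double-check the figures, here is a minimal Python
sketch that parses the ten rows of the flat profile above, verifies that the
cumulative column is the running sum of the self column, and estimates the
total sample count implied by the percentages. The helper names
(parse_profile, check_cumulative, estimated_total) are made up for this
illustration and are not part of bogofilter, oprofile or gprof.

```python
# Rows copied from the flat profile above: percent, cumulative, self, name.
PROFILE = """\
36.44 10891.00 10891.00 yylex
10.98 14174.00 3283.00 msg_compute_spamicity
6.65 16161.00 1987.00 yyinput
5.25 17729.00 1568.00 get_token
5.09 19249.00 1520.00 base64_decode
5.05 20759.00 1510.00 word_cmp
4.67 22154.00 1395.00 xfgetsl
4.16 23396.00 1242.00 wordhash_standard_insert
3.93 24572.00 1176.00 ds_read
2.28 25253.00 681.00 db_get_dbvalue
"""


def parse_profile(text):
    """Return a list of (name, percent, cumulative, self_samples) tuples."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 4:
            continue  # skip blank or malformed lines
        percent, cumulative, self_samples = (float(f) for f in fields[:3])
        rows.append((fields[3], percent, cumulative, self_samples))
    return rows


def check_cumulative(rows):
    """True if each cumulative value equals the running sum of self samples."""
    running = 0.0
    for _name, _percent, cumulative, self_samples in rows:
        running += self_samples
        if abs(running - cumulative) > 0.005:
            return False
    return True


def estimated_total(rows):
    """Average the total sample count implied by each row (self / percent)."""
    estimates = [s / (p / 100.0) for _name, p, _cum, s in rows]
    return sum(estimates) / len(estimates)


rows = parse_profile(PROFILE)
top_share = sum(p for _name, p, _cum, _s in rows)

print("cumulative column consistent:", check_cumulative(rows))
print("share of the listed top ten: %.2f%%" % top_share)
print("estimated total samples: %.0f" % estimated_total(rows))
```

The total is only an estimate because the percentages are rounded to two
decimals; the listed functions cover most, but not all, of the samples.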
--
Matthias Andree