[cvs] bogofilter/src mime.c,1.33,1.34
Evgeny Kotsuba
evgen at shatura.laser.ru
Tue Jan 4 10:11:38 CET 2005
Matthias Andree wrote:
>Evgeny Kotsuba <evgen at shatura.laser.ru> writes:
>
>
>
>>Well, I don't want to dispute this - to me sizeof() is more readable, you
>>like strlen() - let it be, but then use const int ;-)
>>I agree that this has very little effect on optimization.
>>On the other hand, I am not satisfied with 10 messages/sec on an Athlon and
>>can't find where the inner layers worth optimizing are.
>>
>>
>
>/Usually/ the limiting factor is I/O speed, and that you won't change
>with any optimization. CPUs have become pretty fast, sequential
>throughput of drives has improved, but random access is the
>bottleneck. Server disk drives rotate faster, with shorter strokes,
>making more noise, to improve the number of synchronous operations, and
>have more sophisticated queueing (SCSI tagged command queueing).
>
I use JFS with a 200 MB JFS cache, and the database is no more than 50 MB,
so disk operations are not the limiting factor. The typical message size is
about 10-20 KB, so 200 KB/s of input is not limiting either.
>To find out where the program spends its time, use a decent profiling
>tool: Linux has oprofile, Sun touts Solaris 10's DTrace, and there are
>other tools.
>
>Even on older machines, strace with timestamps enabled (-tt or -ttt) may
>give hints. If it spends a lot of time in open, read, write, fsync,
>close, ... you know optimizing the code will not help. In that case,
>only changing algorithms to reduce the number of synchronous I/O
>operations can help.
>
>
Well, I have found one thing: for binary attachments, all input
lines are passed through the whole lexer and the decoding step.
Look at lexer.c -> yyinput():
    //extern mime_t *msg_state;
    if (msg_state)
    {
        if (msg_state->mime_disposition)
        {
            if (msg_state->mime_type == MIME_APPLICATION ||
                msg_state->mime_type == MIME_IMAGE)
                return (count == EOF ? 0 : count); // do not decode at all
        }
    }
This is an attempt to skip the decoding, but really we need to do it somewhere
earlier, right after the line is read.
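To make the idea concrete, here is a minimal sketch of the same test pulled out into a helper that could be called right after a line is read. The type definitions are simplified stand-ins I made up for illustration (the real ones live in bogofilter's MIME code and may differ), and is_binary_attachment() is a hypothetical name, not an existing function:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-ins for the real bogofilter types (assumption:
 * the actual definitions have more members and may use other types). */
typedef enum {
    MIME_TEXT,
    MIME_APPLICATION,
    MIME_IMAGE
} mime_type_t;

typedef struct {
    mime_type_t mime_type;
    int mime_disposition;   /* non-zero once a disposition has been seen */
} mime_t;

/* Hypothetical helper: returns true when the current MIME part matches
 * the condition used in the yyinput() snippet above, i.e. a part with a
 * disposition whose type is application/ or image/. */
static bool is_binary_attachment(const mime_t *state)
{
    if (state == NULL || !state->mime_disposition)
        return false;

    return state->mime_type == MIME_APPLICATION ||
           state->mime_type == MIME_IMAGE;
}
```

With something like this, the line reader could check is_binary_attachment(msg_state) once per line and bypass the decoder before the data ever reaches the lexer.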
I have already asked about bogofilter's speed in a real environment, i.e.
on a per-message basis, but nobody answered me.
SY,
EK