Messages that slow bogofilter down (was: profiling)

Greg Louis glouis at dynamicro.on.ca
Thu Feb 20 23:21:32 CET 2003


On 20030220 (Thu) at 1656:47 -0500, David Relson wrote:

> Message 3.txt is essentially 100,000 x's and 4.txt is 600,000 x's.  Both of 
> these messages have an initial group of tokens followed by one monstrously 
> long token.   They could be considered extreme, pathological cases.

Unquestionably.  But legal.*  The sort of thing we must be able to
handle in the real world in case someone attempts a DoS.  (Or, as in
this case, fudges up a big attachment because there's a suspicion big
attachments are going astray and he wants to test mail handling.)
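
For anyone who wants to reproduce this kind of input, here is a minimal
sketch of a generator. It is not the actual 3.txt or 4.txt: the headers,
addresses, and the function name make_long_token_message are made up for
illustration, and the only property it tries to match is "a few ordinary
tokens followed by one very long token".

```python
def make_long_token_message(token_length: int) -> str:
    """Build a small message whose body ends in one token of token_length x's."""
    headers = [
        "From: sender@example.com",    # hypothetical address
        "To: recipient@example.com",   # hypothetical address
        "Subject: long-token test",
        "",                            # blank line separates headers from body
    ]
    body = [
        "A few ordinary tokens first.",
        "x" * token_length,            # the single monstrous token
    ]
    return "\n".join(headers + body) + "\n"


# Roughly the shape described for 3.txt; use 600_000 for the 4.txt case.
message = make_long_token_message(100_000)
longest_line = max(message.splitlines(), key=len)
```

Writing the result to a file and running the filter on it under time(1)
should show whether the slowdown scales with the token length.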

> Message 2.txt is 5MB long, which is certainly bigger than average.

It's bigger than our average, anyway, because I refuse messages larger
than 5 megabytes at work (personally I refuse anything over 1.6).  The
tendency these days is to press for higher limits, however, and we
should be designing for such limits (I _HATE_ pontificating like this
when I can't contribute meaningfully to the code, but someone's gotta
say it).

> I'm running another test set - with optimization turned on (rather than the 
> previous unoptimized, debug code).  The optimized times are much better 
> (10.90s for 2.txt, 6.66s for 3.txt, 151.13s for 4.txt).

That is better, yes.  Still over two and a half minutes for the
six-hundred-thousand-x file.  If you send a mere hundred of those to a
uniprocessor (UP) mail server that normally handles 1000 emails an hour
(not a big load), with the MTA feeding each message through bogofilter,
by how much do you slow that server's mail handling down?
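
A rough back-of-envelope answer, as a sketch only. It assumes the
messages are filtered strictly one at a time on the single CPU, that
each hostile message costs the 151.13s reported for 4.txt, and that
nothing else competes for the processor; real servers will differ.

```python
# Figures taken from the thread; the serial-processing model is an assumption.
SECONDS_PER_HOSTILE_MESSAGE = 151.13   # optimized time reported for 4.txt
HOSTILE_MESSAGES = 100                 # "a mere hundred"
NORMAL_MESSAGES_PER_HOUR = 1000        # the server's usual load

# Total CPU time spent filtering the hostile messages, back to back.
stall_seconds = HOSTILE_MESSAGES * SECONDS_PER_HOSTILE_MESSAGE
stall_hours = stall_seconds / 3600

# Ordinary mail that arrives during that time and has to wait.
delayed_messages = stall_hours * NORMAL_MESSAGES_PER_HOUR

# Time budget per message at the normal load, for comparison.
normal_budget_seconds = 3600 / NORMAL_MESSAGES_PER_HOUR

print(f"stall: {stall_seconds:.0f} s ({stall_hours:.1f} h)")
print(f"ordinary messages delayed: {delayed_messages:.0f}")
print(f"normal per-message budget: {normal_budget_seconds:.1f} s")
```

Under those assumptions the hundred messages occupy the filter for a
little over four hours, during which roughly four thousand ordinary
messages queue up behind them.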

(*) This was a plain-text attachment with a line longer than 1000
characters, so strictly speaking it is _not_ "legal."  That, however,
wouldn't stop most MTAs these days from passing it on.

> I suspect that the way to pursue this issue is to post info to the mailing 
> list and see what responses appear.

Quod feci.

-- 
| G r e g  L o u i s          | gpg public key:      |
|   http://www.bgl.nu/~glouis |   finger greg at bgl.nu |
| Help free our mailboxes. Include                   |
|        http://wecanstopspam.org in your signature. |
