ext3fs slowness -- how things proceed
Greg Louis
glouis at dynamicro.on.ca
Wed Feb 5 13:01:05 CET 2003
On 20030205 (Wed) at 1220:47 +0100, Matthias Andree wrote:
> Matt Armstrong wrote on Tuesday, 04 February 2003:
>
> > I do wonder if ordering the writes to the db would help. This is the
> > idea I've brought up before. If we write keys to the db in the same
> > order that the db sorts them, theoretically, no single page in the BDB
> > would be written to disk more than once -- though this depends on how
> > BDB is written.
>
> Some pages will be rewritten, but few compared to what we rewrite now.
>
> The question is if it's more efficient to lean towards BerkeleyDB
> caching even if it can't have the whole DB in the cache or if we try to
> tune our write order first.
>
When working with a 20-million-byte (about 19 MB) database, I found
that a 17 MB cache was enough to reach the minimum execution time of
18 seconds (building from scratch using bogoutil with tokens in
random order). With a 16.5 MB cache it took 1 minute 5 seconds; with
16.0 MB, a little over 3 minutes; with 15 MB, six and a half minutes;
with 10 MB, six and a half minutes (maybe a couple of seconds longer
than with 15 MB). 256 KB, the default, took 26 minutes. All this on
ext3, data=ordered.
I suspect that without write ordering, the writes are scattered too
widely across the database for anything but a near-full-size cache to
help. I'm running my production bogofilter at work with a 25 MB cache
because the goodlist is over 30 million bytes there.
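
As a rough illustration of why write ordering might matter, here is a
toy model in Python (not bogofilter or BerkeleyDB code). It assumes
keys sit on fixed-size pages in sorted order behind an LRU cache, and
that every evicted page costs one disk write. The function name, page
size, and cache sizes are made up for the example, and a real btree
splits and rebalances pages, so the numbers are only suggestive.

```python
import random
from collections import OrderedDict


def count_page_writes(keys, keys_per_page, cache_pages):
    """Count page write-backs for a toy sorted store behind an LRU cache.

    Assumption: the key with sorted rank r lives on page r // keys_per_page.
    Every access is a write, so every cached page is dirty.
    """
    rank = {k: i for i, k in enumerate(sorted(keys))}
    cache = OrderedDict()  # page number -> None, least recently used first
    writes = 0
    for k in keys:
        page = rank[k] // keys_per_page
        if page in cache:
            cache.move_to_end(page)  # mark as most recently used
        else:
            if len(cache) >= cache_pages:
                cache.popitem(last=False)  # evict the least recently used page
                writes += 1  # the evicted dirty page goes to disk
            cache[page] = None
    return writes + len(cache)  # flush whatever is still cached at the end


if __name__ == "__main__":
    rng = random.Random(0)
    shuffled = list(range(20000))
    rng.shuffle(shuffled)
    for cache_pages in (20, 100, 200):
        print(cache_pages,
              "random:", count_page_writes(shuffled, 100, cache_pages),
              "sorted:", count_page_writes(sorted(shuffled), 100, cache_pages))
```

In this model, sorted input writes each page exactly once whatever the
cache size, while random input only matches that when the cache can
hold every page.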
--
| G r e g L o u i s | gpg public key: |
| http://www.bgl.nu/~glouis | finger greg at bgl.nu |
| Help free our mailboxes. Include |
| http://wecanstopspam.org in your signature. |