README.ext3
Greg Louis
glouis at dynamicro.on.ca
Sun Feb 2 01:02:59 CET 2003
On 20030201 (Sat) at 1612:36 +0100, Matthias Andree wrote:
> We'd better get the performance issues fixed, or if there's a bug, we'd
> better get that reported. ext2 is way inferior to ext3 in terms of
> consistency, recovery or robustness.
Recovery I will buy; can you supply pointers to evidence for the other
two?
> Given that the performance issues
> cannot be reproduced, claiming ext3 to be slow generally is IMO
> premature. My mbox is smaller than yours and hasn't turned up
> nearly as many tokens
Of course the performance issues cannot be reproduced if you don't
reproduce the conditions. Sheesh...
>, so it might really be a tuning issue or an
> issue with the kernel version that you're using.
Yes, that is true, it might. We need to know. What kernel version are
_you_ using? -- I have one machine I could borrow to check that.
> Plus, priming the data
> base with some training data is an operation that isn't performed very
> often, so we can live with that.
The email to which you were replying mentioned that with ext3 a
_classification_ takes me four times as long as if the db files are on
ext2. Maybe _you_ can live with that...
> I'm very chary about recommending that people turn consistency
> guarantees off; I have learnt BDB isn't very robust against corruption,
> and if something goes wrong, the user should at least notice.
Yes, I agree with you here. Some sort of external recovery strategy is
definitely needed if the db files are on ext2 or ext3/writeback. As
in, "he who laughs last probably made a backup." Probably the warning
I gave is worded too gently.
> > 3. With ext3 in the data=journal mode (all data are committed to the
> > journal prior to being written into the main file system)
> >
> > # umount /xtrn
> > # mount -t ext3 -o data=journal /dev/scd1 /xtrn
> > # rm -f /xtrn/db/*
> > # time /lighter/usr/bin/bogo10 -d /xtrn/db -v -s <spam_corpus
> > # 5868782 words, 14502 messages
> >
> > real 14m11.143s
> > user 2m34.430s
> > sys 0m45.170s
>
> This is a really interesting data point; essentially, it means that
> BDB might do many more synchronous operations than we are aware of, given
> that this takes only half the time of data=writeback.
Of data=ordered. Writeback is only slightly slower than ext2. Just a
typo, no doubt. I too was interested that journal is almost twice as
fast as ordered, but I don't know enough about db internals to turn
that to advantage.
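
For anyone who wants to repeat the comparison under the same conditions,
here is a minimal dry-run sketch. It only prints the commands for one
timing run per ext3 data mode; nothing is mounted or deleted. The device,
mount point and bogo10 path are the ones from the transcript quoted above
and are assumptions about your system, as is the spam_corpus file name;
print_run is a hypothetical helper name.

```shell
#!/bin/sh
# Dry-run sketch: print (do not execute) the commands for one timing run
# per ext3 data mode. DEV, MNT and BOGO are copied from the quoted
# transcript and are assumptions; adjust them for your own machine.
DEV=/dev/scd1
MNT=/xtrn
BOGO=/lighter/usr/bin/bogo10

# print_run MODE: hypothetical helper; MODE is journal, ordered or writeback.
print_run() {
    echo "umount $MNT"
    echo "mount -t ext3 -o data=$1 $DEV $MNT"
    echo "rm -f $MNT/db/*"
    echo "time $BOGO -d $MNT/db -v -s <spam_corpus"
}

for mode in journal ordered writeback; do
    echo "# data=$mode"
    print_run "$mode"
done
```

Review the printed commands, then run them by hand as root; note that the
rm step empties the db directory before each run, so point it at a scratch
file system only.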
--
| G r e g L o u i s | gpg public key: |
| http://www.bgl.nu/~glouis | finger greg at bgl.nu |
| Help free our mailboxes. Include |
| http://wecanstopspam.org in your signature. |