DB corruption within minutes
gyepi at praxis-sw.com
Sat Jan 11 14:14:10 EST 2003
On Sat, Jan 11, 2003 at 01:59:05PM +0100, Matthias Andree wrote:
> On Sat, 11 Jan 2003, Gyepi SAM wrote:
> > > I have added t.lock2 and made minor fixes to bogofilter.
> > I noticed. I also changed the grind loop to run for 1000 iterations,
> > and found no problems.
> Even 8 did the job on my system SuSE Linux 8.1, Duron/700? UWSCSI hard
> drive, DB-4.0.14, Kernel 2.4.19, ext3fs.
> What's the difference to your system? Slower machine? Slower hard drive?
> Faster machine? different OS? different DB version?
Red Hat 7.1, Athlon 1.3 GHz, IDE drives, kernel 2.4.19, ext2.
The machine used to be slower (PII-400), and I have tested with both
db 3.17 and 4.0.14 with the same result.
> > The only solutions I can think of are
> > 1. call open (2) on the database ourselves, so we have a handle to lock
> > 2. use an external lockfile.
> An external lockfile is the global lock we don't want to use for
> scalability reasons. I've also thought about integrating the locking with
It would not be a global lock file; there would be one per database, and it
would be locked the same way we lock the databases now, except that we could
unlock it after the database is closed. I still think solution 1 is better,
though. Fewer changes, too.
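To make the two options concrete, here is a minimal sketch, not bogofilter's actual code. It assumes fcntl-style advisory locks; the names `hold_lock` and `lockfile_for`, and the `.lock` suffix, are hypothetical and chosen only for illustration. Python's `os.open` and `fcntl.lockf` are used because they map directly onto open(2) and fcntl(2).

```python
import fcntl
import os
from contextlib import contextmanager


def lockfile_for(db_path):
    # Hypothetical naming scheme: one lockfile next to each database,
    # so the lock is per database rather than global.
    return db_path + ".lock"


@contextmanager
def hold_lock(path, exclusive=True):
    """Open `path` ourselves and hold an advisory lock on our own descriptor."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        # Blocks until the lock is granted.
        fcntl.lockf(fd, fcntl.LOCK_EX if exclusive else fcntl.LOCK_SH)
        yield fd
    finally:
        # Closing the descriptor releases the lock.
        os.close(fd)
```

Solution 1 would correspond to `with hold_lock("wordlist.db"):` around the database work, and solution 2 to `with hold_lock(lockfile_for("wordlist.db")):`. One thing to keep in mind with solution 1: POSIX fcntl locks belong to the process, and closing any descriptor for a file drops that process's locks on it, so a lock held through our own handle on the database file could still go away when the library closes its descriptor. A separate lockfile does not have that coupling, which is what would let us unlock after the database is closed.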
> > > Plus, I believe we cannot release the lock, have someone else update the
> > > db and then grab the lock again to proceed. The pages may have changed,
> > > so we have inconsistent cache/disk data.
> > If we call db_sync() after updating the database but before releasing
> > the lock, that should fix any synchronization problems of that sort.
> That's only one half. The other half would be to make the other
> databases that have waited for the lock flush /their/ caches, but I don't
> currently see how that would be done other than with DB->close and
I don't think that's necessary, since the other databases that have waited
for the lock would not have modified their databases before acquiring the
lock, so there should be nothing to flush.
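The ordering being argued for (update, sync, then unlock) can be sketched as follows. This is only an illustration under stated assumptions: a plain file stands in for the database, `os.fsync` stands in for the db_sync() step, and the function names are hypothetical. It does not model Berkeley DB's page cache, so it does not settle the cache question raised above; it only shows the sequence.

```python
import fcntl
import os


def update_then_release(path, payload):
    """Writer: append under an exclusive lock and sync before unlocking."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT, 0o644)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)
        os.lseek(fd, 0, os.SEEK_END)
        os.write(fd, payload)
        os.fsync(fd)                    # stand-in for syncing the database
        fcntl.lockf(fd, fcntl.LOCK_UN)  # release only after the sync
    finally:
        os.close(fd)


def read_after_lock(path):
    """Waiter: touch the data only once the shared lock has been granted."""
    fd = os.open(path, os.O_RDONLY)
    try:
        fcntl.lockf(fd, fcntl.LOCK_SH)
        os.lseek(fd, 0, os.SEEK_SET)
        return os.read(fd, os.fstat(fd).st_size)
    finally:
        os.close(fd)
```

Because the waiter reads nothing before it holds the lock, it has nothing of its own to flush in this simplified model.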
> It's sort of euhm unhelpful if you cannot reproduce the problem, because
> it seems you have most experience with BDB of all active bogofilter
> hackers. I seem to have grasped the basics though.
Yes, it does seem strange that I cannot reproduce the problem, but I'll keep
working on it.