database copying and compacting

Matthias Andree matthias.andree at gmx.de
Mon Nov 15 23:24:53 CET 2004


"Pavel Kankovsky" <peak at argo.troja.mff.cuni.cz> writes:

> On Sun, 7 Nov 2004, David Relson wrote:
>
>> Now, with 0.93.0 it's necessary to save log files and use dd (with
>> proper block size) when copying the database.  Thus copying becomes:
>> 
>>    SIZE=`db_stat -h $SRC -d wordlist.db | grep "page size" | cut -f 1` 
>>    cp $SRC/log* $SRC/__db.* $DST  
>>    for FILE in $SRC/*.db ; do
>>        dd bs=$SIZE if=$FILE of=$DST/`basename $FILE`
>>    done
>
> According to Berkeley DB's "Database and log file archival"
> (http://www.sleepycat.com/docs/ref/transapp/archival.html)
> hot database backup (*) should copy db files BEFORE log files and the 
> order is important.

Check the brand new db_tar script (currently only in CVS) - it does just
that: it tars up the databases (as returned by db_archive -s) and then
the logs (as returned by db_archive -l) to stdout, and can optionally
remove the unneeded log files after the backup, or before it (not
recommended).
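Roughly, the idea is something like this (just a sketch of the approach,
not the actual db_tar script; it assumes $BOGODIR is the transactional
bogofilter directory and that db_archive matches the Berkeley DB version
that wrote the database):

   BOGODIR=${BOGODIR:-$HOME/.bogofilter}
   cd "$BOGODIR" || exit 1
   # databases first, then the logs - the order matters for a hot backup;
   # the tar archive goes to stdout, redirect it wherever you like
   tar cf - `db_archive -s -h .` `db_archive -l -h .`
   # optionally, AFTER the backup has completed, remove log files that
   # are no longer needed:
   # db_archive -d -h .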

> The requirement makes sense to me: if you copy log
> files first and db files next, you might end with db files containing 
> data from updates missing in log files and the result will be 
> unrecoverable. On the other hand, if you copy db files first and log 
> files next, all data in db files can be either committed or rolled back 
> using the information in log files (at least unless you drop some log 
> files in the middle of backup).
>
> Moreover, it might be a good idea to add something like
> db_recover -c -h $DST in order to put the destination db into a 
> consistent state.

No. bogofilter -f $DST is possible, but db_recover MUST NOT be used on
live databases. Running recovery underneath a running application wreaks
real havoc like you've never seen before. I tried that on a copy of my
database to find out how bad it really was, and it was worse than I
expected. Boom.

> (*) I assume this is what you intend to do because dd is necessary to 
> guarantee page-level read consistency wrt concurrent writes.

Yup. The actual problem was that on some systems, cp(1) used mmap(2),
which broke the read isolation and caused non-atomic reads: half a page
new, the other half stale. It's less of a problem when the database has
been created under BerkeleyDB 4.1 - 4.3, because we request page
checksums for those versions, so a torn page is at least detected.
4.0 and older don't support checksums.
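To check whether a copied database came through intact, something along
these lines can be run against the copy (a sketch only; db_verify is the
stock Berkeley DB utility, and it can only flag torn pages reliably when
the database carries page checksums, i.e. was created under 4.1 or
newer):

   for FILE in "$DST"/*.db ; do
       # verify the copied file, not the live one; db_verify must never
       # be pointed at a database that is in active use
       db_verify -h "$DST" `basename "$FILE"` || echo "$FILE is damaged"
   done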

-- 
Matthias Andree
