database copying and compacting
Matthias Andree
matthias.andree at gmx.de
Sat Nov 13 00:38:22 CET 2004
David Relson <relson at osagesoftware.com> writes:
> Matthias,
>
> Sanity check please :-)
>
> The copying and compacting of databases has gotten more complex with the
> new release. BerkeleyDB's Transaction capability generates log files
> which need to be included when copying and compacting. I suspect I'll
> have db_copy and db_compact scripts before much longer. Before I go too
> far in that direction, I wanted to check my understanding with you.
>
> 1) With 0.92.8, database copying was as simple as:
>
> cp $ORIG/wordlist.db $NEW/
>
> Now, with 0.93.0 it's necessary to save log files and use dd (with
> proper block size) when copying the database. Thus copying becomes:
>
> SIZE=`db_stat -h $ORIG -d wordlist.db | grep "page size" | cut -f 1`
> cp $SRC/log* $SRC/__db.* $DST
> for FILE in $SRC/*.db ; do
> dd bs=$SIZE if=$FILE of=$DST/`basename $FILE`
> done
>
> The for loop supports multiple databases, e.g. wordlist.db and
> ignore.db, in $SRC
Right, except that I'd determine SIZE inside the loop, once per file, and
quote the variables:
    set -e
    cp "$SRC"/log.* "$SRC"/__db.* "$DST"
    for FILE in "$SRC"/*.db ; do
        SIZE=`db_stat -d "$FILE" | grep "page size" | cut -f 1`
        dd bs=$SIZE if="$FILE" of="$DST"/`basename "$FILE"`
    done
The dd, with bs set to the page size, is there so that every page is
copied in one piece and the copied database can still be recovered if
something goes wrong during the copy.
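A minimal sketch of the same per-file copy wrapped in two functions. It
assumes, as the pipeline above does, that db_stat prints the page size as
the first tab-separated field of a line containing "page size". The names
page_size and copy_db are made up for illustration; the check on the size
only guards against handing dd an empty or non-numeric bs=.

```sh
# Hypothetical helper: read `db_stat -d` output on stdin, print the page size.
page_size() {
    awk -F '\t' '/page size/ { print $1; exit }'
}

# Hypothetical helper: copy one database file with dd, one page per block.
# Usage: copy_db FILE DESTDIR PAGESIZE
copy_db() {
    case "$3" in
        ''|*[!0-9]*)
            echo "copy_db: bad page size '$3' for $1" >&2
            return 1 ;;
    esac
    dd bs="$3" if="$1" of="$2/$(basename "$1")" 2>/dev/null
}

# Intended use (needs Berkeley DB's db_stat, so it is left commented out):
# for FILE in "$SRC"/*.db ; do
#     copy_db "$FILE" "$DST" "$(db_stat -d "$FILE" | page_size)" || exit 1
# done
```

Failing early on a bad size means a changed db_stat output format stops the
script instead of producing a copy made with the wrong block size.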
I'd think if you don't want the logs, you can do this instead:
    set -e
    db_checkpoint -1h "$SRC"
    for FILE in "$SRC"/*.db ; do
        SIZE=`db_stat -d "$FILE" | grep "page size" | cut -f 1`
        dd bs=$SIZE if="$FILE" of="$DST"/`basename "$FILE"`
    done
That's it. The environment will be recreated from scratch in the new
location, "$DST".
> 2) With 0.92.8, database compacting (with backup) looked like:
>
> bogoutil -d $ORIG/wordlist.db > $NEW/wordlist.txt
> bogoutil -l $NEW/wordlist.db < $NEW/wordlist.txt
> mv $ORIG/wordlist.db $ORIG/wordlist.db.orig
> mv -f $NEW/wordlist.db $ORIG/wordlist.db
That holds for traditional mode. For concurrent mode, the same procedure
as for transactional mode applies.
> Now, with 0.93.0 it's:
>
> bogoutil -d $ORIG/wordlist.db > $NEW/wordlist.txt
> bogoutil -l $NEW/wordlist.db < $NEW/wordlist.txt
> mv $ORIG/wordlist.db $ORIG/wordlist.db.orig
> mv -f $NEW/wordlist.db $ORIG/wordlist.db
What's this good for? You can't just copy a wordlist.db file into an
existing environment: it will get confused, and I'd rather not see the
consequences. Just omit the mv commands.
The rest isn't necessary.
> cd $NEW
> db_checkpoint -1 -h .
> rm -f `db_archive -h .`
You've moved the database file out of the directory, so these operations
are pointless.
How about:
    set -e
    mkdir "$NEW"
    cp "$ORIG"/DB_CONFIG "$NEW" || true
    bogoutil -d "$ORIG"/wordlist.db > "$NEW"/wordlist.txt
    bogoutil -l "$NEW"/wordlist.db < "$NEW"/wordlist.txt
    rm "$NEW"/wordlist.txt
    db_checkpoint -1h "$NEW"
    db_archive -dh "$NEW"
    mv "$ORIG" "$ORIG.old"
    mv "$NEW" "$ORIG"
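One thing worth checking before running steps like these: if $ORIG.old
already exists as a directory, mv puts $ORIG inside it rather than renaming
it. A small pre-flight check could look like the sketch below; check_dirs
is a made-up name, not something bogofilter ships.

```sh
# Hypothetical pre-flight check for the compaction steps.
# Usage: check_dirs ORIG NEW
check_dirs() {
    # The source directory must actually hold a wordlist.
    [ -f "$1/wordlist.db" ] || { echo "no wordlist.db in $1" >&2; return 1; }
    # The scratch directory is created by the script, so it must not exist.
    [ ! -e "$2" ] || { echo "$2 already exists" >&2; return 1; }
    # An existing backup directory would swallow the mv.
    [ ! -e "$1.old" ] || { echo "$1.old already exists" >&2; return 1; }
}

# Intended use at the top of the script:
# check_dirs "$ORIG" "$NEW" || exit 1
```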
> As I understand it, db_checkpoint ensures that the log file contents are
> included in the database (as far as possible) and db_archive lists log
> files that can be deleted.
True. db_archive -d will remove them.
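To see what would go before it goes, the checkpoint, the listing and the
removal can be run in that order. A sketch, with prune_logs as a made-up
wrapper name around the Berkeley DB utilities mentioned above:

```sh
# Hypothetical helper: checkpoint an environment, list the log files that
# are no longer needed, then let db_archive delete them.
# Usage: prune_logs ENVDIR
prune_logs() {
    db_checkpoint -1 -h "$1" || return 1
    echo "removable log files in $1:"
    db_archive -h "$1" || return 1
    db_archive -d -h "$1"
}
```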
--
Matthias Andree