Database security with training only once in a while

David Relson relson at osagesoftware.com
Sun Jan 16 14:34:36 CET 2005


Hi pi,

Using bogofilter with the new transaction code isn't all that hard.  The
big difference is that a database is no longer a single file, but is now
a directory (called a "database environment") with several files for
locking, logging, configuration, etc.  The environment makes it possible
for multiple programs to simultaneously read and write the database,
something which isn't possible in old versions of bogofilter.
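
To make the "directory with several files" idea concrete, here is a
minimal Python sketch that sorts the contents of an environment
directory into rough categories.  It assumes the usual Berkeley DB
naming (log files called log.NNNNNNNNNN, shared region files called
__db.NNN) and bogofilter's default wordlist.db; the function names
classify and summarize_environment are made up for this illustration
and are not part of bogofilter.

```python
import os
import re

# Berkeley DB transaction logs: "log." followed by ten digits.
LOG_RE = re.compile(r"^log\.\d{10}$")
# Berkeley DB shared region files (they back the lock table, cache, etc.).
REGION_RE = re.compile(r"^__db\.\d{3}$")


def classify(name):
    """Return a rough category for one file name in an environment."""
    if LOG_RE.match(name):
        return "log"
    if REGION_RE.match(name):
        return "region"
    if name == "DB_CONFIG":
        return "config"
    if name.endswith(".db"):
        return "database"
    return "other"


def summarize_environment(path):
    """Group the file names found in directory `path` by category."""
    summary = {}
    for name in sorted(os.listdir(path)):
        summary.setdefault(classify(name), []).append(name)
    return summary
```

Calling summarize_environment on your bogofilter directory returns a
dict of category to file names, which is a quick way to see what the
transactional build has added next to the wordlist.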

The config file DB_CONFIG is needed when the database is large to let
Berkeley DB know how many locks to allocate.  Copying the environment is
no longer a simple "cp" command, as there are log files and a program
might be writing the database when you want to copy it.  We've added
scripts to handle the tricky parts, i.e. bf_copy, bf_compact, bf_tar,
and bf_resize.
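
For a setup like yours, where a single process trains on a copy and
nothing else has the environment open, replacing the production
directory wholesale can be sketched roughly as below.  This is only an
illustration under that assumption, not a substitute for bf_copy: the
function name replace_environment and the ".old" backup suffix are
invented here, and the swap uses two renames, so there is a brief
moment when the production path does not exist.

```python
import os
import shutil
import tempfile


def replace_environment(training_dir, production_dir):
    """Copy training_dir beside production_dir, then swap it into place.

    Assumes no process has either environment open.  Returns the path
    of the backup that holds the previous production environment.
    """
    production_dir = os.path.abspath(production_dir)
    parent = os.path.dirname(production_dir)
    backup = production_dir + ".old"

    # Stage the copy on the same filesystem so the final rename is cheap.
    staging = tempfile.mkdtemp(prefix=".bf_staging_", dir=parent)
    try:
        shutil.copytree(training_dir, staging, dirs_exist_ok=True)
    except Exception:
        # A failed copy must not touch the production environment.
        shutil.rmtree(staging, ignore_errors=True)
        raise

    # Keep exactly one previous generation as a fallback.
    if os.path.isdir(backup):
        shutil.rmtree(backup)
    if os.path.isdir(production_dir):
        os.rename(production_dir, backup)
    os.rename(staging, production_dir)
    return backup
```

The copy is finished before the production directory is touched, so a
failure while copying leaves the old environment exactly as it was.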

If you don't want the new capability, configure has a
"--disable-transactions" option.  Building with that option leaves open
a timing window that may cause database corruption if the database is
being written at the time of a system (or bogofilter) crash.  Of course
you've been dealing with that possibility for a couple of years :-)

The transaction code has become much easier to deal with as bogofilter
has gone from 0.93.0 to 0.93.4.  It's in good shape now!

HTH,

David

On Sun, 16 Jan 2005 11:37:09 +0100
Boris 'pi' Piwinger wrote:

> Hi!
> 
> I still have not upgraded to the new versions since I did
> not manage to understand all the tricks and changes needed to
> keep on running (yes, this is my fault, but anyway, that's
> how it is). For the same reason I have not yet updated
> bogominitrain.pl to work with the new version; my
> understanding is that it works except for compacting the
> database, so this problem is not too bad.
> 
> Now I'm wondering how I should use the new version. Due to
> the way I train, I only train once every few days from one
> process on a copy of the database. So there is no danger of
> problems due to locking failure or something like this. Also
> I do have a copy (actually the production database) which is
> replaced only after successfully doing the new training.
> After that I have two copies of the database. What can
> happen is a disk failure, but then I will not be able to use
> the new functionality to repair the database. I do have
> copies of my mail files on another machine, so in this case
> I can build the database from scratch.
> 
> Now: Is there any reason for me to use the new feature? It
> will just blow up the size of the database, right?
> 
> Am I right that I should still be able to replace the
> production database by the training database by replacing
> all the files?
> 
> pi



