Database security with training only once in a while

David Relson relson at osagesoftware.com
Sun Jan 16 16:06:52 CET 2005


On Sun, 16 Jan 2005 14:49:06 +0100
Boris 'pi' Piwinger wrote:

> David Relson <relson at osagesoftware.com> wrote:
> 
> >Using bogofilter with the new transaction code isn't all that hard.  The
> >big difference is that a database is no longer a single file, but is now
> >a directory (called a "database environment") with several files for
> >locking, logging, configuration, etc.  The environment makes it possible
> >for multiple programs to simultaneously read and write the database,
> >something which isn't possible in old versions of bogofilter.
> 
> Well, as I said, write access is not an issue for me. Read
> access has never been a problem so far either.
> 
> >The config file DB_CONFIG is needed when the database is large to let
> >Berkeley DB know how many locks to allocate.  Copying the environment is
> >no longer a simple "cp" command, as there are log files and a program
> >might be writing the database when you want to copy it.  
> 
> Not in my case.
> 
> >We've added
> >scripts to handle the tricky parts, i.e. bf_copy, bf_compact, bf_tar,
> >and bf_resize.
> 
> bf_compact might be the solution for bogominitrain.pl. Does
> bf_copy allow a forced mode for replacing a database?

bf_compact compacts an old environment (wordlist) into a new
environment, then swaps the two with the "mv" command.  The net effect
is that ~/.bogofilter is renamed to ~/.bogofilter.old and the new
~/.bogofilter holds the compacted wordlist.

bf_copy just copies the wordlist from one environment to another.  Since
you're "keeping it simple" (not using simultaneous reads and writes),
you should be able to use it.  I suspect the scenario would be (roughly):

   bf_copy official.dir training.dir
   bogominitrain.pl training.dir
   bf_copy training.dir official.dir
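
Spelled out as a shell function, the three steps above might look like
the sketch below.  It stops at the first failing step, so a failed
training run never gets copied back over the official wordlist.  The
function name "retrain" and the BF_COPY and TRAIN variables are
hypothetical, made up for this illustration; they are not part of
bogofilter.  The sketch also assumes that bf_copy is willing to write to
the destination given; whether it will replace an existing directory is
the open question above, so the old copy may have to be moved aside
first.

```shell
#!/bin/sh
# Rough sketch of the copy / train / copy-back cycle.
# BF_COPY and TRAIN are hypothetical override variables (not bogofilter
# settings); by default they name the real commands.
: "${BF_COPY:=bf_copy}"
: "${TRAIN:=bogominitrain.pl}"

# retrain OFFICIAL_DIR TRAINING_DIR [extra arguments for the trainer...]
# Returns 0 on success, or 1, 2, 3 to show which step failed.
retrain() {
    official=$1
    training=$2
    shift 2

    # Step 1: take a private copy of the official wordlist.
    "$BF_COPY" "$official" "$training" || return 1

    # Step 2: train against the private copy only.
    "$TRAIN" "$training" "$@" || return 2

    # Step 3: copy the trained wordlist back.
    # Assumption: the destination may need to be moved away first.
    "$BF_COPY" "$training" "$official" || return 3
}
```

It would be called as "retrain official.dir training.dir", followed by
whatever mailbox arguments bogominitrain.pl normally gets.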

> 
> >If you don't want the new capability, configure has a
> >"--disable-transactions" option.  Building with that option leaves open
> >a timing window that may cause database corruption if the database is
> >being written at the time of a system (or bogofilter) crash.  Of course
> >you've been dealing with that possibility for a couple of years :-)
> 
> Indeed, not a problem in my case, though.
> 
> >The transaction code has become much easier to deal with as bogofilter
> >has gone from 0.93.0 to 0.93.4.  It's in good shape now!
> 
> When do we expect the next stable version? This would be a
> time for me to start using it and change bogominitrain.pl.

That's uncertain :-<  Berkeley DB with transactions provides an
environment with better protection against database problems, which is
good.  However, the environment is more complex, and database
maintenance has a steeper learning curve.  Transaction support also uses
more disk storage for the log files.  It still needs to be decided
whether the default configuration should be transactional or
non-transactional.  I'll post a write-up/query/survey in a while.

David


