bf_compact

Matthias Andree matthias.andree at gmx.de
Fri May 13 18:28:28 CEST 2005


R Kimber <rkimber at ntlworld.com> writes:

> On Thu, 12 May 2005 20:59:20 -0400
> David Relson <relson at osagesoftware.com> wrote:
>
>> The purpose of "bf_compact dir" is to compact the wordlist(s) in
>> directory "dir".  For database security, a new directory is created to
>> hold the newly compacted wordlists.  Once the compacting is done, the
>> original directory, i.e. "dir", is renamed to "dir.old" and the new
>> directory is renamed to "dir".  A special check is included for file
>> DB_CONFIG and it's copied to the new directory.  Other files in "dir"
>> are left unchanged, hence end up in "dir.old".
>> 
>> Given the directory renaming, "bf_compact ." is problematical :-<
>
> Maybe this is because the directory structure bogofilter uses isn't
> optimal.  Instead of working in the ~/.bogofilter directory, perhaps the
> stuff should be in subdirectories of this.  Thus one could have two
> subdirectories: wordlist.active and wordlist.old
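
For reference, the rename sequence David describes can be sketched
roughly as follows. This is only an illustration in Python, not the
actual bf_compact script: the function name swap_in_compacted and the
compact callback are hypothetical, and the real tool may differ in
details such as error handling.

```python
import os
import shutil
import tempfile


def swap_in_compacted(directory, compact):
    """Compact into a fresh directory, then swap it into place.

    `compact(src, dst)` is a hypothetical callback that writes the
    compacted wordlist(s) from `src` into `dst`.
    """
    directory = os.path.abspath(directory)
    parent = os.path.dirname(directory)

    # Work in a new sibling directory so the original stays untouched
    # until the compaction has finished.
    new_dir = tempfile.mkdtemp(
        prefix=os.path.basename(directory) + ".new.", dir=parent
    )
    compact(directory, new_dir)

    # DB_CONFIG gets special treatment: it is copied across.
    config = os.path.join(directory, "DB_CONFIG")
    if os.path.isfile(config):
        shutil.copy2(config, new_dir)

    # Swap: "dir" becomes "dir.old", the new directory becomes "dir".
    # Any other files in "dir" simply stay behind in "dir.old".
    # Assumption: "dir.old" does not already exist.
    old_dir = directory + ".old"
    os.rename(directory, old_dir)
    os.rename(new_dir, directory)
    return old_dir
```

Anything that was stored alongside the wordlists and is not DB_CONFIG
ends up only in the directory returned by the function.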

I don't think moving the default directory is a wise choice at this
time. The software has been using ~/.bogofilter for as long as I can
remember, and bf_compact is not something that I'd run regularly. In
particular, I am inclined to consider the ~/.bogofilter directory
bogofilter's property, and storing your own files in that directory,
which the bogofilter documentation does not authorize, happens at your
own risk.

Note that most of the size difference you see after bf_compact stems
from files being written sparsely, and from merging pages together that
were once split when tokens got added. However, bf_compact causes more
page splits to become necessary in the future, which may adversely
affect registration performance. I don't have figures handy, though.
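
One way to see how much of a size difference comes from sparseness is
to compare a file's apparent size with the space actually allocated for
it. A minimal sketch in Python, assuming a POSIX system where st_blocks
is available (the helper name apparent_and_allocated is made up for
this example):

```python
import os


def apparent_and_allocated(path):
    """Return (apparent size, allocated bytes) for a file.

    On POSIX, st_blocks counts 512-byte units. If the allocated figure
    is well below the apparent size, the file is probably sparse.
    """
    st = os.stat(path)
    return st.st_size, st.st_blocks * 512
```

Running it on a wordlist file before and after compaction shows
whether the apparent shrinkage corresponds to disk space actually
freed.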

Conclusion: If you are offended by the way bf_compact works, don't use
it. Now that bogofilter can automatically remove log files if running in
transactional mode, and bogoutil --db-prune=DIRECTORY can manually
purge excess log files, there's little reason to use bf_compact at all.

-- 
Matthias Andree


