What did I do wrong?

David Relson relson at osagesoftware.com
Thu Feb 19 23:31:10 CET 2004


On Thu, 19 Feb 2004 11:04:44 -0500 (EST)
tallison at tacocat.net wrote:

> OK, I'll be the one to ask.
> 
> Why would a dump/load, without doing anything else to the data, make
> the database smaller?

BerkeleyDB uses a sorted tree structure to keep track of data.  When an
entry is added to a full block, the block is split in two.  This leaves
both blocks only partially full.

The dump/load sequence feeds correctly ordered tokens into an empty
database.  No block splitting needs to happen.  The result is:  same
data, less storage.  As additional tokens are added (by subsequent
bogofilter runs), blocks are split and added and storage efficiency
drops.
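
A toy simulation makes the effect visible.  This is only a sketch, not
BerkeleyDB's actual page layout or split policy: the block capacity, the
split-in-half rule, and the function names are all assumptions made for
illustration.  It compares inserting keys in arrival order (splitting
full blocks) with loading the same keys in sorted order into an empty
structure.

```python
import bisect
import random

BLOCK_CAPACITY = 8  # hypothetical number of entries that fit in one block


def insert_with_splits(keys, capacity=BLOCK_CAPACITY):
    """Insert keys one at a time; a full block is split into two halves."""
    blocks = [[]]  # sorted blocks, kept in key order
    for key in keys:
        # Pick the last block whose first key is <= key (block 0 otherwise).
        firsts = [block[0] for block in blocks[1:]]
        i = bisect.bisect_right(firsts, key)
        block = blocks[i]
        if len(block) == capacity:
            # Assumed split policy: cut the full block down the middle.
            mid = capacity // 2
            left, right = block[:mid], block[mid:]
            blocks[i:i + 1] = [left, right]
            block = right if key >= right[0] else left
        bisect.insort(block, key)
    return blocks


def bulk_load(keys, capacity=BLOCK_CAPACITY):
    """Model a dump/load: sorted keys are packed into blocks back to back."""
    ordered = sorted(keys)
    return [ordered[i:i + capacity] for i in range(0, len(ordered), capacity)]


def fill_ratio(blocks, capacity=BLOCK_CAPACITY):
    """Fraction of the allocated slots that actually hold an entry."""
    return sum(len(block) for block in blocks) / (len(blocks) * capacity)


rng = random.Random(2004)                 # fixed seed so runs are repeatable
keys = rng.sample(range(1_000_000), 4000)  # unique keys in arrival order

grown = insert_with_splits(keys)
reloaded = bulk_load(keys)

print("incremental:", len(grown), "blocks, fill", round(fill_ratio(grown), 2))
print("dump/load:  ", len(reloaded), "blocks, fill", round(fill_ratio(reloaded), 2))
```

Both structures hold exactly the same keys.  The incrementally grown
one needs more blocks because every split leaves two half-full blocks
behind, while the reloaded one is packed tight.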

HTH,

David



