massive disk space leak vs thresh_update
relson at osagesoftware.com
Sat Dec 11 07:10:55 EST 2004
On Sat, 11 Dec 2004 11:48:18 +0100
Matthias Andree wrote:
> David Relson <relson at osagesoftware.com> writes:
> > As thresh_update only affects folks using '-u' and as it has
> > distinct benefits, I've been thinking that "thresh_update=0.01"
> > should become part of bogofilter's default configuration.
> > What do y'all think?
> I think we should disable -u for the nonce until we have solid data on
> the "learning" that -u is supposed to do. It is not clear that this
> option actually does what we want and can easily be emulated from
> procmail or maildrop for those who still want it.
No. Disabling '-u' is a code change that would force me to run a
patched version of bogofilter and I'm unwilling to do that.
Using a non-zero value of thresh_update significantly reduces
disk usage. It has a moderate effect on the size of wordlist.db
and a major effect on the volume of logfiles.
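My first thought is to default thresh_update to 0.01. However, the
default spam_cutoff is 0.99, and the two combined would block
autoupdating of spam (but not ham): every message classified as spam
scores at least 0.99, which puts it within 0.01 of 1.0, so it would
always be skipped. A thresh_update of 0.005 should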
A second thought is to suggest adding a cron job to run db_checkpoint
and/or db_archive. People who don't want logfiles using lots of disk
space won't want to save the logfiles, so letting Berkeley DB delete
them is reasonable.
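
For anyone who wants to try the cron approach, here is a rough sketch
of what such a job might look like. It assumes the Berkeley DB
utilities are installed under their plain names (some distributions
add a version to the name, e.g. db4.2_checkpoint) and that the
wordlist lives in ~/.bogofilter. The cleanup_logs function and the RUN
variable are made up for this example; they are not part of bogofilter
or Berkeley DB.

```shell
#!/bin/sh
# Sketch only: checkpoint a Berkeley DB environment once, then let
# db_archive delete the log files that are no longer needed.
# cleanup_logs and RUN are hypothetical names used for illustration.

cleanup_logs() {
    dir="$1"
    # Setting RUN=echo prints the commands instead of running them.
    # -1 : write a single checkpoint and exit
    # -h : database environment home directory
    ${RUN:-} db_checkpoint -1 -h "$dir" &&
    # -d : remove log files that are no longer in use
    ${RUN:-} db_archive -d -h "$dir"
}

# Run only when a directory is passed on the command line.
if [ "$#" -gt 0 ]; then
    cleanup_logs "$1"
fi
```

Saved as, say, bogo-logclean.sh (a made-up name), it could be run
nightly from a crontab entry such as
"15 3 * * * /path/to/bogo-logclean.sh $HOME/.bogofilter". Note that
once db_archive -d has removed the old logs, they are no longer
available for catastrophic recovery, which is the trade-off described
above.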