Using the -u option and database size
Bill McClain
wmcclain at salamander.com
Wed Mar 21 12:27:22 CET 2007
On Wed, 21 Mar 2007 10:42:41 +0100
Peter Gutbrod <lists at media-fact.com> wrote:
> So far I have used the -u option with bogofilter. Meanwhile my wordlist.db has
> grown to about 200 MB and I'm wondering whether it puts much load on
> the server to match each mail against such a big database.
>
> I think the size is mainly due to the automatic registering with the -u
> option.
>
> So what do you think? Is it better not to use the -u option to keep the
> database small? Or do you think a 200 MB database is not a problem even on a
> production mailserver that is receiving thousands of (spam) emails every day?
You might look into the "thresh_update" parameter:
# Skip autoupdating if the spamicity is within this value
# of 0.000000 (surely ham) or 1.000000 (surely spam).
I use the default of 0.01, meaning spam with spamicity greater than 0.99
and ham with spamicity less than 0.01 are not registered. This cuts down
autoupdate registrations by a large factor; my rough guess is to about 1/10.
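As a rough illustration of that rule, here is a small Python sketch. It is not bogofilter's actual code: the function name and the sample scores are made up for the example, and treating scores exactly at the boundary as "still registered" is an assumption.

```python
def should_autoupdate(spamicity: float, thresh_update: float = 0.01) -> bool:
    """Return True if a message with this score would still be registered by -u.

    Hypothetical helper: models the rule "skip autoupdating if the spamicity
    is within thresh_update of 0.0 (surely ham) or 1.0 (surely spam)".
    """
    if not 0.0 <= spamicity <= 1.0:
        raise ValueError("spamicity must be between 0 and 1")
    surely_ham = spamicity < thresh_update           # e.g. below 0.01
    surely_spam = spamicity > 1.0 - thresh_update    # e.g. above 0.99
    return not (surely_ham or surely_spam)


# Made-up scores, for illustration only.
scores = [0.0002, 0.004, 0.37, 0.62, 0.93, 0.9991, 0.99999]

# Only the messages the filter was unsure about remain to be registered.
registered = [s for s in scores if should_autoupdate(s)]
print(registered)
```

With these sample scores only the three middle messages would be registered; the four that are confidently classified are skipped, which is where the reduction in database growth comes from.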
The idea is that well-recognized messages do not need to be registered. You
do miss some new tokens and counts that might be useful in the future, but in
practice I've found that accuracy is not harmed.
-Bill
--
Sattre Press The King in Yellow
http://sattre-press.com/ by Robert W. Chambers
info at sattre-press.com http://sattre-press.com/kiy.html