Using the -u option and database size

John G Walker johngwalker at tiscali.co.uk
Thu Mar 22 19:47:07 CET 2007



On Thu, 22 Mar 2007 13:30:08 -0500 Bill McClain
<wmcclain at salamander.com> wrote:

> On Thu, 22 Mar 2007 13:23:11 -0500
> Tom Anderson <tanderso at oac-design.com> wrote:
> 
> > Of course bogofilter can already add a header.  It's called
> > "X-Bogosity" and the value is Unsure, Spam/Yes, Ham/No, or however
> > you have it set in your bogofilter.cf.  If your X-Bogosity is
> > Unsure, then it was not registered, otherwise it was.  Just do a
> > simple regex.
> 
> Not when using "-u" and "thresh_update". A message may be ham or spam
> and not registered.
> 
> You could compare spamicity in "X-Bogosity" to the thresh_update
> value, if the script could handle floating point.
> 
> -Bill

How accurate and precise do you need to be? If we're talking about
thousands of emails (which we are, over the longer term), then a few
that have been wrongly registered aren't going to make much difference.
In the long run, because you're continuously retraining, things will
sort themselves out.

You only need to make sure you don't create a long-term bias in your
training strategy.
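
For anyone who does want the exact check Bill describes, here is a
minimal sketch in Python rather than shell, since it needs floating
point. It assumes the default X-Bogosity layout (a label followed by
"spamicity=...") and that "-u" skips messages scoring within
thresh_update of 0.0 or 1.0; check both against your own bogofilter.cf.
The names was_registered, message_text and HEADER_RE are hypothetical,
not part of bogofilter.

```python
import re

# Assumed header layout, e.g.
#   X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=...
# Adjust the pattern if header_format is changed in bogofilter.cf.
HEADER_RE = re.compile(
    r"^X-Bogosity:\s*(?P<label>[^,\s]+),.*?\bspamicity=(?P<score>[0-9]*\.?[0-9]+)",
    re.IGNORECASE | re.MULTILINE,
)


def was_registered(message_text, thresh_update):
    """Guess whether 'bogofilter -u' registered this message.

    Returns True or False, or None if no X-Bogosity header is found.
    """
    match = HEADER_RE.search(message_text)
    if match is None:
        return None  # nothing to judge from

    label = match.group("label").lower()
    score = float(match.group("score"))

    # Unsure messages are never registered by -u.
    if label.startswith("unsure"):
        return False

    # Assumption: a score within thresh_update of 0.0 or 1.0 counts as
    # "sure enough" and is skipped; anything in between is registered.
    return thresh_update <= score <= 1.0 - thresh_update
```

Call it with the message text and the same thresh_update value you
have in bogofilter.cf, for example was_registered(text, 0.01). It only
looks at the first X-Bogosity line it finds, so pass in the headers
alone if the body might quote one.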

-- 
 All the best,
 John
