When is spam_cutoff too low?
Tom Anderson
tanderso at oac-design.com
Mon Dec 13 03:25:43 CET 2004
On Sun, 2004-12-12 at 18:38, Matej Cepl wrote:
> Tom Anderson wrote:
> > Look at all of your ham and find the highest scoring one over the past 1-3
> > months. You can set your spam cutoff to just above that value and not
> > fear getting false positives. It's still possible of course, but highly
> > unlikely. I find that by using -u to register all of my hams
> > automatically, my highest ham score is around 0.01.
>
> Would you have some tool to get these statistics from the email corpora,
> or should I put together some combination of grep, procmail, and other shell
> tools (or Python) myself?
It is a statistic I gather simply by training on error. Over time, you
should get a feel for how your email is scored. If you see any ham pop
up in your unsures, those are the ones you should look at to guide your
spam_cutoff decision. Hopefully, that will only be a handful and you
can find the highest scorer manually. If you're getting any false
positives, don't set your spam_cutoff any lower until you've trained
more.
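
If you would rather script it than eyeball it, here is a minimal Python
sketch. It assumes your mail has been run through bogofilter in passthrough
mode, so each message carries an X-Bogosity header with a "spamicity=" field,
and that the mbox file you point it at contains only mail you have verified
by hand as ham. The function names and the command-line usage are made up
for illustration; they are not part of bogofilter.

```python
import mailbox
import re
import sys

# Matches the numeric value in e.g.
# "X-Bogosity: Ham, tests=bogofilter, spamicity=0.000120, version=..."
SPAMICITY_RE = re.compile(r"spamicity=([0-9]*\.?[0-9]+(?:[eE][-+]?[0-9]+)?)")


def parse_spamicity(header_value):
    """Return the spamicity as a float, or None if it cannot be found."""
    if header_value is None:
        return None
    match = SPAMICITY_RE.search(str(header_value))
    return float(match.group(1)) if match else None


def highest_ham_score(messages):
    """Return the highest spamicity among the given messages.

    Messages without a usable X-Bogosity header are skipped.
    Returns None if no message had a score.
    """
    scores = []
    for msg in messages:
        score = parse_spamicity(msg.get("X-Bogosity"))
        if score is not None:
            scores.append(score)
    return max(scores) if scores else None


if __name__ == "__main__":
    # Hypothetical usage: python highest_ham.py /path/to/verified-ham.mbox
    box = mailbox.mbox(sys.argv[1], create=False)
    top = highest_ham_score(box)
    if top is None:
        print("No X-Bogosity scores found.")
    else:
        print("Highest ham spamicity: %f" % top)
```

A spam_cutoff just above the printed value would follow the rule of thumb
quoted above. The result is only as good as the folder: one misfiled spam
will push the number up.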
Tom