bogotune and "exhaustion"
David Relson
relson at osagesoftware.com
Mon Mar 29 14:29:04 CEST 2004
On Mon, 29 Mar 2004 07:02:20 -0500
Tom Allison wrote:
> Ran into a cute catch-22.
>
> bogotune wants a sample size that includes some high scoring ham and
> low scoring spam (maybe) to get a good calculation of what to set the
> parameters at.
>
> Running corrections to exhaustion tends to remove that high scoring
> ham, giving you a big fat remark and a shortened bogotune output.
>
> So, it seems that I can do one or the other but not both on my
> archives. Or I'm doing something wrong.
Hi Tom,
What you say sounds reasonable. Bogotune needs the variation in ham
scores so that it can adjust the number of false positives. I can
believe that train-to-exhaustion conflicts with this. Bogotune could
perhaps be modified so that it no longer needs the high-scoring ham,
but that will take some thought, since it would require a different way
of selecting spam_cutoff. I'll think about it, but can't guarantee a
solution.
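
To illustrate the conflict, here is a minimal sketch in Python. It is not
bogotune's actual code: the function names, the candidate cutoffs, and the
scores are all made up for illustration, and it assumes a message counts as
spam when its score is at or above the cutoff. It picks the lowest candidate
cutoff that keeps false positives within a target count.

```python
def count_false_positives(ham_scores, cutoff):
    """Count ham messages that would be classified as spam (score >= cutoff)."""
    return sum(1 for score in ham_scores if score >= cutoff)


def pick_cutoff(ham_scores, target_fp, candidates):
    """Return the lowest candidate cutoff with at most target_fp false positives.

    Hypothetical helper for illustration only; returns None if no candidate
    satisfies the target.
    """
    for cutoff in sorted(candidates):
        if count_false_positives(ham_scores, cutoff) <= target_fp:
            return cutoff
    return None


# Made-up candidate cutoffs and ham scores.
candidates = [0.5, 0.6, 0.7, 0.8, 0.9, 0.99]
varied_ham = [0.01, 0.02, 0.10, 0.45, 0.65, 0.85]     # some high-scoring ham
exhausted_ham = [0.01, 0.02, 0.03, 0.02, 0.01, 0.04]  # after train-to-exhaustion

varied_cutoff = pick_cutoff(varied_ham, 1, candidates)
exhausted_cutoff = pick_cutoff(exhausted_ham, 1, candidates)

print(varied_cutoff)     # the high-scoring ham forces the cutoff upward
print(exhausted_cutoff)  # every candidate passes, so the lowest one is returned
```

With the varied scores, the high-scoring ham determines where the cutoff
lands. With the exhausted scores, every candidate gives zero false
positives, so the ham scores no longer say anything about which cutoff to
prefer.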
David