flaw in spamicity calculation
Eric Seppanen
eds at reric.net
Wed Sep 18 22:01:51 CEST 2002
On Wed, Sep 18, 2002 at 03:37:23PM -0400, David Relson wrote:
>
> Methinks it's time to take a serious look at some alternate ideas for
> converting individual word probabilities into a message's spamicity.
How about this: maybe we should stop capping the per-word spamicity at
0.01 and 0.99. Let the uncapped values determine what words make it into
the extrema array. Then, before running the Bayes-like combination on the
extrema, cap the values as before, or apply a curve to the values to get
them to fit.
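A rough sketch of that ordering in Python, in case it helps. It assumes a
Graham-style combination over the values farthest from 0.5. The function
names and the default of 15 extrema are illustrative assumptions, not
bogofilter's actual code.

```python
from math import prod  # Python 3.8+


def select_extrema(probs, keep=15):
    """Return the `keep` values farthest from the neutral 0.5.

    Ranking uses the raw, uncapped values, so 0.999 outranks 0.99.
    `keep=15` is an assumed default for illustration only.
    """
    return sorted(probs, key=lambda p: abs(p - 0.5), reverse=True)[:keep]


def cap(p, lo=0.01, hi=0.99):
    """Clamp a per-word value into [lo, hi]."""
    return min(max(p, lo), hi)


def combine(probs):
    """Bayes-like combination: prod(p) / (prod(p) + prod(1 - p))."""
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


def spamicity(word_probs, keep=15):
    """Select extrema on uncapped values, then cap, then combine."""
    extrema = select_extrema(word_probs, keep)
    return combine([cap(p) for p in extrema])
```

If the cap were applied before selection, 0.99 and 0.999 would be
indistinguishable; ranking on the raw values keeps the stronger word.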
The capping is needed because, unless you change the Bayes-like
calculation, any value very near 0, or very near or greater than 1, will
override all the other values.
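To illustrate the override effect, here is a small Python sketch. It
assumes the Bayes-like calculation is the usual product form,
prod(p) / (prod(p) + prod(1 - p)); the function name `combine` is made up
for this example.

```python
from math import prod  # Python 3.8+


def combine(probs):
    """Assumed Bayes-like combination: prod(p) / (prod(p) + prod(1 - p))."""
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)


# One uncapped 0.0 zeroes the spam product, so three strong spam words
# cannot move the result away from 0.
uncapped = combine([0.0, 0.99, 0.99, 0.99])

# With the same word capped at 0.01, the other three words outvote it.
capped = combine([0.01, 0.99, 0.99, 0.99])

# A value above 1 makes (1 - p) negative and pushes the result out of [0, 1].
out_of_range = combine([1.2, 0.5])

print(uncapped, capped, out_of_range)
```

The same happens in the other direction: a single 1.0 zeroes the ham
product and forces the result to 1 regardless of the other words.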