a source of useful info and a question (correction)
Greg Louis
glouis at dynamicro.on.ca
Mon Nov 18 19:41:58 CET 2002
On 20021118 (Mon) at 1322:39 -0500, Greg Louis wrote:
>
> Both geometric-mean and Fisher-based calculation use
> scalefactor = badlist_messagecount / goodlist_messagecount
> in bogofilter. In spambayes, if the goodlist has more tokens,
> scalefactor = badlist_tokencount / goodlist_tokencount
> and otherwise
> scalefactor = goodlist_tokencount / badlist_tokencount
>
> We calculate, for each token from w = 1 to n where n is the number of
> unique tokens in the message,
> f(w) = (s * x + badcount) / (s + badcount + goodcount * scalefactor)
> and spambayes does something essentially the same.
Of course, in spambayes, if the badlist has more tokens, the f(w)
denominator is (s + badcount * scalefactor + goodcount), i.e. the
scalefactor is applied to badcount rather than to goodcount.
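
To make the two cases concrete, here is a rough Python sketch of the
formulas above. The function names and the scale_bad flag are made up for
illustration and are not taken from the bogofilter or spambayes sources.
Only the denominator is given for the spambayes "badlist has more tokens"
case, so the sketch computes the full f(w) only for the bogofilter form.

```python
def bogofilter_scalefactor(bad_msgs, good_msgs):
    # bogofilter: ratio of message counts (badlist / goodlist).
    return bad_msgs / good_msgs


def spambayes_scalefactor(bad_tokens, good_tokens):
    # spambayes, as described above: ratio of token counts,
    # with the larger list always in the denominator.
    if good_tokens > bad_tokens:
        return bad_tokens / good_tokens
    return good_tokens / bad_tokens


def fw_denominator(s, badcount, goodcount, scalefactor, scale_bad=False):
    # scale_bad is a hypothetical flag: True models the spambayes case
    # where the badlist has more tokens, so badcount is scaled instead.
    if scale_bad:
        return s + badcount * scalefactor + goodcount
    return s + badcount + goodcount * scalefactor


def f_w_bogofilter(s, x, badcount, goodcount, scalefactor):
    # f(w) = (s * x + badcount) / (s + badcount + goodcount * scalefactor)
    return (s * x + badcount) / fw_denominator(s, badcount, goodcount, scalefactor)


if __name__ == "__main__":
    # Illustrative numbers only, not measurements.
    sf = bogofilter_scalefactor(100, 200)
    print(f_w_bogofilter(s=1.0, x=0.5, badcount=3, goodcount=2, scalefactor=sf))
    print(fw_denominator(1.0, 3, 2, spambayes_scalefactor(200, 100), scale_bad=True))
```

With equal list sizes the scalefactor is 1 in every variant, and both
denominators reduce to s + badcount + goodcount.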
--
| G r e g L o u i s | gpg public key: |
| http://www.bgl.nu/~glouis | finger greg at bgl.nu |