multiple wordlists
David Relson
relson at osagesoftware.com
Sun Mar 16 14:17:05 CET 2003
At 07:08 AM 3/16/03, Greg Louis wrote:
>On 20030315 (Sat) at 1729:43 -0500, David Relson wrote:
>
> > 4 - Weight the two lists, i.e. give higher importance to token values in
> > the user list. The relevant formula would be:
> >
> > p(w,weighted) = (W*p(w,user) + p(w,site))/(W+1)
> > I think that option 4 doesn't extend
> > well beyond 2 lists.
>
>Why not? In the two-list case, it makes sense to give the site list a
>weight of 1 and not mention it. In the case of more than two lists,
>assuming there's a reason to combine them with weighting (which there
>may or may not be), you generalize:
>
> p(w,weighted) = sum(i)(W(i) * p(w,i)) / sum(i)(W(i))
I was thinking of W as a single global weighting constant. Instead, each
wordlist could be paired with its own weight, as in your generalized formula.
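
To make the per-list weighting concrete, here is a minimal sketch in Python of the generalized formula quoted above. It is only an illustration, not bogofilter code: the function name weighted_probability and the sample values are assumptions made up for this example.

```python
def weighted_probability(probs, weights):
    """Combine per-wordlist values p(w,i) using per-list weights W(i).

    Computes sum(i)(W(i) * p(w,i)) / sum(i)(W(i)).
    Hypothetical helper for illustration; not part of bogofilter.
    """
    if len(probs) != len(weights):
        raise ValueError("need exactly one weight per wordlist")
    total_weight = sum(weights)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    return sum(w * p for w, p in zip(weights, probs)) / total_weight


# Two-list case: user list weighted W, site list weighted 1 (sample values).
W = 3.0
p_user, p_site = 0.9, 0.5
two_list = (W * p_user + p_site) / (W + 1)
general = weighted_probability([p_user, p_site], [W, 1.0])
```

With the site list's weight fixed at 1, the general form gives the same value as the two-list formula, so the two-list case needs no special handling.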