multiple wordlists

David Relson relson at osagesoftware.com
Sun Mar 16 14:17:05 CET 2003


At 07:08 AM 3/16/03, Greg Louis wrote:

>On 20030315 (Sat) at 1729:43 -0500, David Relson wrote:
>
> > 4 - Weight the two lists, i.e. give higher importance to token values in
> > the user list.  The relevant formula would be:
> >
> >     p(w,weighted) = (W*p(w,user) + p(w,site))/(W+1)
> > I think that option 4 doesn't extend
> > well beyond 2 lists.
>
>Why not?  In the two-list case, it makes sense to give the site list a
>weight of 1 and not mention it.  In the case of more than two lists,
>assuming there's a reason to combine them with weighting (which there
>may or may not be), you generalize:
>
>   p(w,weighted) = sum(i)(W(i) * p(w,i)) / sum(i)(W(i))

I was thinking of W as a single global weighting constant.  Instead, each 
wordlist could have its own weight W(i), as in your generalized formula.
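
To make the generalized formula concrete, here is a rough sketch in Python.
It is only an illustration of the arithmetic, not bogofilter code; the
function name weighted_probability and its arguments are hypothetical, and
it assumes each list has already produced its own p(w,i) for the token.

```python
def weighted_probability(probs, weights):
    """Combine per-wordlist probabilities p(w,i) using weights W(i).

    probs   -- p(w,i) for one token, one entry per wordlist
    weights -- W(i), one entry per wordlist (hypothetical configuration)
    """
    if len(probs) != len(weights):
        raise ValueError("need exactly one weight per wordlist")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # sum(i)(W(i) * p(w,i)) / sum(i)(W(i))
    return sum(w * p for w, p in zip(weights, probs)) / total


# Two-list case: user list weighted W, site list weighted 1, which
# reduces to (W*p(w,user) + p(w,site)) / (W+1).
W = 3.0
p_two = weighted_probability([0.9, 0.5], [W, 1.0])

# Three lists, each with its own weight.
p_three = weighted_probability([0.9, 0.5, 0.1], [3.0, 1.0, 1.0])
```

With the site list's weight fixed at 1, the two-list call is the same
calculation as option 4, so the general form covers both cases.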




