"url:" counts
David Relson
relson at osagesoftware.com
Thu Jan 8 20:28:29 CET 2004
On Thu, 08 Jan 2004 14:18:19 -0500
Matt Garretson <mattg at assembly.state.ny.us> wrote:
> David Relson wrote:
> > Prompted by Matt's comment on the misnaming of "url:" tokens, I
> > counted what's in my database and how many have very low or very
> > high scores.
>
>
> FWIW, here are my values:
>
> (note that my corpus's ham/spam ratio is about 1/2)
>
>  count    score
>
>  9,734    < 0.01
> 67,798    >= 0.99
>
>    936    < 0.001
>  4,049    >= 0.999
>
> 79,551 "url:" tokens
> (61,332 of these are singletons)
>
> 829,442 total tokens
Lots of strong indicators, eh?
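
To put rough numbers on that, here is a small sketch that turns Matt's counts into proportions. It assumes the four score counts refer to "url:" tokens only (the message does not say so explicitly) and that the < 0.001 and >= 0.999 rows are subsets of the < 0.01 and >= 0.99 rows. The variable names are made up for illustration and are not bogofilter identifiers.

```python
# Counts copied from Matt's message; nothing here is read from a real wordlist.
url_tokens = 79_551        # all "url:" tokens
url_singletons = 61_332    # "url:" tokens seen only once
total_tokens = 829_442     # every token in the wordlist

low_01, high_99 = 9_734, 67_798      # score < 0.01 and score >= 0.99
low_001, high_999 = 936, 4_049       # score < 0.001 and score >= 0.999

# Assumption: the score counts above are counts of "url:" tokens.
extreme_share = (low_01 + high_99) / url_tokens
very_extreme_share = (low_001 + high_999) / url_tokens
singleton_share = url_singletons / url_tokens
url_share = url_tokens / total_tokens

print(f'"url:" tokens as share of wordlist:    {url_share:.1%}')
print(f'"url:" singletons:                     {singleton_share:.1%}')
print(f'"url:" tokens scoring <0.01 or >=0.99:   {extreme_share:.1%}')
print(f'"url:" tokens scoring <0.001 or >=0.999: {very_extreme_share:.1%}')
```

Under those assumptions, most of the "url:" tokens sit at one extreme or the other, and a large part of them are singletons, so the extreme scores may say more about how rarely each token was seen than about how reliable it is.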