Anybody seen this?
Ben Rosengart
br at panix.com
Wed Sep 18 00:03:09 CEST 2002
On Tue, Sep 17, 2002 at 04:02:43PM -0500, Eric Seppanen wrote:
> On Tue, Sep 17, 2002 at 04:50:36PM -0400, Paul Tomblin wrote:
> > It's an explanation of what the original Paul Graham paper got wrong:
> > http://radio.weblogs.com/0101454/stories/2002/09/16/spamDetection.html
>
> Note that there are two algorithms in bogofilter, and this paper deals
> with only the second.
>
> The first algorithm outputs a "spamicity" value for a given word, using
> information gleaned from two or more wordlists.
>
> The second algorithm combines some of those numbers to output a
> "spamicity" value for the whole message.
I don't believe that is correct. The paper addresses per-word
spamicity under the "Further Improvement" headings.
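
For reference, the two steps being discussed can be sketched roughly as
below. This is only a minimal Python sketch, assuming the formulas from
Paul Graham's "A Plan for Spam" (good counts doubled, values clamped to
[0.01, 0.99], 0.4 for unseen words, naive Bayes combination). The
function names are hypothetical and this is not bogofilter's actual code.

```python
from math import prod


def word_spamicity(good_count, bad_count, ngood, nbad):
    """Step 1: per-word spamicity from counts in two wordlists.

    Assumed Graham-style formula; ngood and nbad are the number of
    good and spam messages seen, and both must be nonzero.
    """
    if good_count + bad_count == 0:
        return 0.4  # assumed default for a word never seen before
    g = min(1.0, 2.0 * good_count / ngood)  # good hits count double
    b = min(1.0, bad_count / nbad)
    return max(0.01, min(0.99, b / (g + b)))  # clamp away from 0 and 1


def message_spamicity(probs):
    """Step 2: combine per-word values into one score for the message."""
    spam = prod(probs)
    ham = prod(1.0 - p for p in probs)
    return spam / (spam + ham)
```

A caller would compute word_spamicity() for each token, pick the values
furthest from 0.5, and pass them to message_spamicity().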
--
Ben Rosengart (212) 741-4400 x215
Microsoft has argued that open source is bad for business, but you
have to ask, "Whose business? Theirs, or yours?" --Tim O'Reilly
For summary digest subscription: bogofilter-digest-subscribe at aotto.com