Anybody seen this?

Ben Rosengart br at panix.com
Wed Sep 18 00:03:09 CEST 2002


On Tue, Sep 17, 2002 at 04:02:43PM -0500, Eric Seppanen wrote:
> On Tue, Sep 17, 2002 at 04:50:36PM -0400, Paul Tomblin wrote:
> > It's an explanation of what the original Paul Graham paper got wrong:
> > http://radio.weblogs.com/0101454/stories/2002/09/16/spamDetection.html
> 
> Note that there are two algorithms in bogofilter, and this paper deals 
> with only the second.
> 
> The first algorithm outputs a "spamicity" value for a given word, using 
> information gleaned from two or more wordlists.
> 
> The second algorithm combines some of those numbers to output a 
> "spamicity" value for the whole message.

I don't believe that is correct.  The paper addresses per-word
spamicity under the "Further Improvement" headings.
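
For reference, the two stages described in the quoted text can be
sketched roughly as follows.  This is only a minimal Python
illustration, assuming the formulas from Paul Graham's "A Plan for
Spam" essay.  The function names are hypothetical, and the 0.4
fallback for rare words and the 15-token cutoff follow that essay in
simplified form; none of it is taken from bogofilter's source.

```python
import math


def word_spamicity(good_count, bad_count, n_good, n_bad):
    """Stage 1: per-word score from counts in a ham list and a spam list.

    n_good and n_bad are the number of messages behind each wordlist and
    are assumed to be non-zero.
    """
    g = 2 * good_count  # ham counts are doubled to bias against false positives
    b = bad_count
    if g + b < 5:
        # Simplification: too little data, so fall back to a mildly hammy default.
        return 0.4
    good_ratio = min(1.0, g / n_good)
    bad_ratio = min(1.0, b / n_bad)
    p = bad_ratio / (good_ratio + bad_ratio)
    # Clamp so that no single word is treated as absolute proof.
    return max(0.01, min(0.99, p))


def message_spamicity(word_probs, n_interesting=15):
    """Stage 2: combine the per-word scores that lie farthest from 0.5."""
    top = sorted(word_probs, key=lambda p: abs(p - 0.5), reverse=True)
    top = top[:n_interesting]
    prod_spam = math.prod(top)
    prod_ham = math.prod(1.0 - p for p in top)
    return prod_spam / (prod_spam + prod_ham)
```

The point of the split is that the first function only looks at
wordlist counts for one word, while the second only sees the resulting
numbers, so either stage can be changed without touching the other.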

-- 
Ben Rosengart     (212) 741-4400 x215

Microsoft has argued that open source is bad for business, but you
have to ask, "Whose business?  Theirs, or yours?"    --Tim O'Reilly

For summary digest subscription: bogofilter-digest-subscribe at aotto.com


