Spam / ham registration issue
David Relson
relson at osagesoftware.com
Wed Mar 3 14:25:18 CET 2004
On Thu, 4 Mar 2004 00:07:45 +1100
Tig wrote:
...[snip]...
> Thanks heaps for the reply people. My understanding now is: Each word
> in the test case has been registered as spam and ham, so therefore
> balance out and give a neutral result. It does not matter how many
> times a word is registered as spam or ham, just the fact that it has
> been recorded as either or both.
>
> Would this be a correct summary?
Pretty much. The basic principle is comparing the likelihood of the word
being in spam to the likelihood of the word being in ham. You've maxed
out both of them :-)
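
To make the comparison concrete, here is a minimal sketch of a per-word
spam/ham ratio. It is an illustration only, not bogofilter's actual
calculation; the function name word_spamicity and its parameters are
made up for this example, and the formula is a simplified frequency
ratio assumed for the sake of the demonstration.

```python
def word_spamicity(spam_count, ham_count, spam_msgs, ham_msgs):
    """Estimate in [0, 1] of how spammy a word is; 0.5 means neutral.

    Hypothetical helper: compares the fraction of spam messages that
    contain the word with the fraction of ham messages that contain it.
    """
    spam_freq = spam_count / spam_msgs if spam_msgs else 0.0
    ham_freq = ham_count / ham_msgs if ham_msgs else 0.0
    total = spam_freq + ham_freq
    if total == 0:
        return 0.5  # word never seen: no evidence either way
    return spam_freq / total


# The same message registered once as spam and once as ham:
# each word appears in 1 of 1 spam messages and 1 of 1 ham messages.
print(word_spamicity(1, 1, 1, 1))  # 0.5
```

Registering the same message three times as spam and three times as ham
gives word_spamicity(3, 3, 3, 3), which is still 0.5. Both frequencies
are already at their maximum, so repeating the registration changes
nothing.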
An alternate view of the world would use message counts rather than
percents of words in messages. The alternate view could give us "I get
5 times as much spam as ham, so the odds are 5:1 that the next message
is spam."
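
The message-count view can be sketched the same way. This is only the
arithmetic behind the 5:1 remark, under the assumption that the next
message is drawn from the same mix as the ones already counted; the
function names and the counts 500 and 100 are invented for the example.

```python
from fractions import Fraction


def prior_spam_odds(spam_msgs, ham_msgs):
    """Odds (spam : ham) based purely on how many of each were received."""
    return Fraction(spam_msgs, ham_msgs)


def odds_to_probability(odds):
    """Convert odds of a:b, expressed as the ratio a/b, to a probability."""
    return odds / (1 + odds)


odds = prior_spam_odds(500, 100)   # 5 times as much spam as ham
print(odds)                        # 5
print(odds_to_probability(odds))   # 5/6
```

Odds of 5:1 correspond to a probability of 5/6, or roughly 83%, that the
next message is spam before any of its words are looked at. Note that
prior_spam_odds raises ZeroDivisionError if no ham has been counted yet.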
David