Spam / ham registration issue

David Relson relson at osagesoftware.com
Wed Mar 3 14:25:18 CET 2004


On Thu, 4 Mar 2004 00:07:45 +1100
Tig wrote:

...[snip]...

> Thanks heaps for the reply people. My understanding now is: Each word
> in the test case has been registered as spam and ham, so therefore
> balance out and give a neutral result. It does not matter how many
> times a word is registered as spam or ham, just the fact that it has
> been recorded as either or both.
> 
> Would this be a correct summary?

Pretty much.  The basic principle is comparing the likelihood of the
word being in spam to the likelihood of it being in ham.  You've maxed
out both of them :-)
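To make the "maxed out" point concrete, here is a minimal sketch of that
comparison.  It is an illustration only, not bogofilter's actual code:
the function name word_spamicity and the plain ratio it uses are
assumptions made for this example, and the smoothing a real filter
applies to rarely seen words is left out.

```python
def word_spamicity(spam_with_word, spam_total, ham_with_word, ham_total):
    """Compare how often a word shows up in spam versus ham.

    Hypothetical helper for illustration; not bogofilter's real formula.
    Returns a value in [0, 1]; 0.5 means the word is neutral.
    """
    if spam_total <= 0 or ham_total <= 0:
        raise ValueError("need at least one spam and one ham message")
    spam_freq = spam_with_word / spam_total  # fraction of spam containing the word
    ham_freq = ham_with_word / ham_total     # fraction of ham containing the word
    if spam_freq + ham_freq == 0:
        return 0.5  # word never seen in either corpus: no evidence
    return spam_freq / (spam_freq + ham_freq)


# The test case from this thread: the same message registered as both
# spam and ham, so every word is in 100% of the spam and 100% of the ham.
once = word_spamicity(1, 1, 1, 1)
many_times = word_spamicity(5, 5, 5, 5)
```

Both calls come out neutral (0.5) under this simplified ratio, because
the two fractions stay at 100% no matter how many times the same
message is registered on each side.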

An alternate view of the world would use message counts rather than
percents of words in messages.  The alternate view could give us "I get
5 times as much spam as ham, so the odds are 5:1 that the next message
is spam."
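As a rough sketch of that message-count view (again only an
illustration; the helper names and the 500/100 counts are made up for
this example and say nothing about how bogofilter works internally):

```python
from fractions import Fraction


def prior_spam_odds(spam_messages, ham_messages):
    """Odds that the next message is spam, from message counts alone.

    Hypothetical helper; the counts are whatever the user has received.
    """
    if ham_messages <= 0:
        raise ValueError("need at least one ham message to form odds")
    return Fraction(spam_messages, ham_messages)


def odds_to_probability(odds):
    """Convert odds of a:b in favour into the probability a / (a + b)."""
    return odds / (1 + odds)


# Assumed example counts: 5 times as much spam as ham.
odds = prior_spam_odds(500, 100)         # Fraction(5, 1), i.e. 5:1
probability = odds_to_probability(odds)  # Fraction(5, 6)
```

With those assumed counts the odds are 5:1, which corresponds to a 5/6
chance that the next message is spam before looking at a single word.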

David



