Spam / ham registration issue
Tom Anderson
tanderso at oac-design.com
Wed Mar 3 14:38:02 CET 2004
On Wed, 2004-03-03 at 08:25, David Relson wrote:
> Pretty much. The basic principle is comparing the likelihood of the word
> being in spam to the word being in ham. You've maxed out both of them :-)
So registering _other_ hams and spams that lack these tokens would tend
to have more effect than registering this same message over and over?
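
To convince myself, here is a minimal sketch of the idea. It assumes a
simplified per-token score (relative frequency in spam versus relative
frequency in ham) and is not bogofilter's actual formula; the function name
and the counts are made up for illustration, and it assumes both message
totals are non-zero.

```python
def token_spamicity(spam_with_token, spam_total, ham_with_token, ham_total):
    """Simplified per-token score: share of spam containing the token,
    compared against the share of ham containing it (hypothetical helper)."""
    spam_freq = spam_with_token / spam_total  # fraction of spam messages with the token
    ham_freq = ham_with_token / ham_total     # fraction of ham messages with the token
    return spam_freq / (spam_freq + ham_freq)

# The token is in the only registered spam and the only registered ham.
once = token_spamicity(1, 1, 1, 1)       # 0.5

# Registering that same spam 50 times: the frequency is still 50/50 = 1.0.
many = token_spamicity(50, 50, 1, 1)     # 0.5, unchanged

# Registering 49 *other* spams that lack the token: frequency drops to 1/50.
others = token_spamicity(1, 50, 1, 1)    # about 0.02

print(once, many, others)
```

In this toy model, re-registering the same message leaves the score pinned
where it was, while registering different messages moves it.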
> An alternate view of the world would use message counts rather than
> percents of words in messages. The alternate view could give us "I get
> 5 times as much spam as ham, so the odds are 5:1 that the next message is
> spam."
Although it sounds almost reasonable, it fails for the same reason as
racial profiling. The innocent ones get harassed unduly. Biasing 5:1
toward spam on each email would lead to an inordinate number of false
positives.
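
A rough sketch of what I mean, using plain Bayes' rule in odds form
(posterior odds = prior odds x likelihood ratio). The function name and the
example likelihood ratio are assumptions for illustration only; this is not
how bogofilter computes its score.

```python
def posterior_spam_probability(likelihood_ratio, prior_odds=1.0):
    """Combine message evidence with prior odds (hypothetical helper).

    likelihood_ratio: how much more likely the message's words are under
    spam than under ham (values below 1 favor ham).
    prior_odds: spam:ham odds assumed before looking at the message.
    """
    odds = prior_odds * likelihood_ratio
    return odds / (1.0 + odds)

# A message whose words lean toward ham by 2:1.
neutral = posterior_spam_probability(0.5)                 # even prior: 1/3
biased = posterior_spam_probability(0.5, prior_odds=5.0)  # 5:1 prior: 2.5/3.5

print(round(neutral, 3), round(biased, 3))
```

With an even prior the message stays on the ham side of one half; with the
5:1 prior the same message lands above 0.7, even though its content favors
ham.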
Tom