spam levels [was: Including html-tag contents ...]

Tony L. Svanstrom tony at moon.pp.se
Tue May 13 20:44:39 CEST 2003


On Mon, 12 May 2003 the voices made David Relson write:

DR> I'm still not clear on your idea.  It sounds like you want to change a
DR> percentage of the token scores to 0.0 or to 1.0.
DR> So "-u 1" would take the lowest 30% of the hammish tokens and change their
DR> scores to 0.0, i.e. "pure ham", and that "-u 4" would take the highest 20%
DR> and set them to 1.0???

 Almost; instead of replacing any tokens, extra tokens would be added with a
maximum/minimum score.

DR> Given a message of 100 words, suppose 50 were eliminated by MIN_DEV and of
DR> the remaining 50 words, 20 had hammish scores (less than 0.50) and 30 had
DR> spammish scores (above 0.50).  What effect would values "-U1", "-U2", etc
DR> have???  (Note: since "-u" is in use and "-U" is not used, I'm using "-U"
DR> to designate the "Unsure" switch.)

 The extra spammish/hammish tokens would be added on top of the existing ones,
so you'd end up with the token counts in the table below.

 With 100 words/tokens, 50 of them eliminated by MIN_DEV, and 30 spammish and
20 hammish tokens left from the text, this many spammish/hammish tokens would
be used:

	Spammish	Hammish
 -U1	30		40
 -U2	30		30
 -U3	30		20
 -U4	40		20
 -U5	50		20
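
 As a rough illustration of the table, here is a minimal Python sketch. The
message does not say how the number of added tokens is derived, so the step of
10 tokens per level and the neutral level of 3 are assumptions picked only to
reproduce the numbers above. The names padded_counts, STEP and NEUTRAL_LEVEL
are hypothetical and are not part of bogofilter.

```python
# Hypothetical sketch: the names and constants below are assumptions chosen
# to reproduce the table in this message; they are not bogofilter code.

NEUTRAL_LEVEL = 3   # assumed: -U3 adds no extra tokens
STEP = 10           # assumed: extra tokens added per level away from neutral


def padded_counts(spammish, hammish, level):
    """Return (spammish, hammish) token counts after padding for a -U level."""
    if not 1 <= level <= 5:
        raise ValueError("level must be between 1 and 5")
    extra = abs(level - NEUTRAL_LEVEL) * STEP
    if level < NEUTRAL_LEVEL:
        # Below neutral: add tokens carrying the minimum score (pure ham).
        return spammish, hammish + extra
    if level > NEUTRAL_LEVEL:
        # Above neutral: add tokens carrying the maximum score (pure spam).
        return spammish + extra, hammish
    return spammish, hammish


if __name__ == "__main__":
    # The example from the thread: 30 spammish and 20 hammish tokens remain.
    for level in range(1, 6):
        s, h = padded_counts(30, 20, level)
        print(f"-U{level}\t{s}\t\t{h}")
```

 The existing tokens are never changed here; only the counts on one side grow,
which is the difference from replacing scores with 0.0 or 1.0.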


-- 
  .-------------------------------------------------------------------.
  | Per scientiam ad libertatem! (Through knowledge towards freedom!) |
  `-------------------------------------------------------------------´
                   << ©1998-2003 tony at svanstrom.com >>




