the importance of robx

David Relson relson at osagesoftware.com
Sun Feb 29 17:17:20 CET 2004


On 29 Feb 2004 11:06:22 -0500
Tom Anderson wrote:

> On Sat, 2004-02-28 at 20:21, David Relson wrote:
> > Try "bogofilter -v < /dev/null" to see the score of a message with
> > no tokens.
> 
> [tanderso at www .bogofilter]$ bogofilter -v < /dev/null
> X-Bogosity: Unsure, tests=bogofilter, spamicity=0.480000,
> version=0.16.0
> 
> It's equal to my robx value... how is that possible?  There are no
> words to classify as robx, and robx is within my min_dev interval of
> 0.2. Bogofilter doesn't seem to be following its own rules.

Tom,

A message may well have no scorable tokens.  An empty message such as
/dev/null has no tokens at all, and with a large min_dev a message
containing only neutral tokens has nothing left to score.  Bogofilter
has a special check for this case: when no scorable tokens remain, the
message score is set to the value of robx.

The result for this special case could be changed to 0.5.  However,
given our stated preference for false negatives over false positives,
using robx (0.48 in your configuration, slightly on the non-spam side
of 0.5) seems proper.
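
The special case described above can be sketched roughly as follows.
This is an illustration in Python, not bogofilter's actual C code: the
names is_scorable and message_score are hypothetical, the ">=" boundary
test is an assumption, and the plain average is only a stand-in for
bogofilter's real calculation over the scorable tokens.

```python
def is_scorable(prob, min_dev):
    """Assumed rule: a token counts only if its probability is at
    least min_dev away from the neutral value 0.5."""
    return abs(prob - 0.5) >= min_dev


def message_score(token_probs, robx, min_dev):
    """Return a spamicity for a list of per-token probabilities."""
    # Drop tokens that fall inside the min_dev band around 0.5.
    scorable = [p for p in token_probs if is_scorable(p, min_dev)]

    if not scorable:
        # Special case: nothing left to score, so fall back to robx.
        return robx

    # Stand-in combiner (plain average); NOT bogofilter's real
    # calculation, used here only so the sketch runs.
    return sum(scorable) / len(scorable)


# An empty message, as with "bogofilter -v < /dev/null".
empty = message_score([], robx=0.48, min_dev=0.2)

# Every token lies inside a wide min_dev band of 0.45.
all_neutral = message_score([0.3, 0.6, 0.9], robx=0.48, min_dev=0.45)
```

Under these assumptions both calls return robx (0.48): the first
because there are no tokens at all, the second because min_dev excludes
every token.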

> > True.  But that's a starting condition, which is not typical.  Once
> > a message has been registered, tokens will start to be recognized.
> 
> What if you set your min_dev to 0.45?  Quite a few messages would have
> no significant tokens.  Theoretically speaking, they should be
> neutral, but if every single token scores as robx or otherwise between
> 0.05 and 0.95, then none of them contribute to the classification. 
> I'm assuming from my test above that the message would still score as
> robx, but how would bogofilter generate such a classification given no
> classifiable tokens?
> 
> Tom



