comments from a new user

Peter Bishop pgb at adelard.com
Sat May 10 09:34:48 CEST 2003


On 9 May 2003 at 23:13, Andrew Pimlott wrote:

> > Consider a token with a pgood score of 0.1.  That means it appears in 10% 
> > of the good messages that have been registered.  Given that there are a 
> > gazillion other messages in the world, your statement isn't quite right.
> 
> True, but I would think that anyone even roughly familiar with how
> bogofilter works would understand that these numbers are based on
> the messages that have been registered.
> 
> > Perhaps you can suggest a different wording ...
> 
>     "the likelihood (extrapolated from your registered non-spam
>     messages) that a non-spam message contains this token"
> 

       "The proportion of good messages that contained this token"

"Proportion" is simply that: a count over the messages seen so far. It
claims nothing about the probability of future messages containing the token.

I think it is obvious from the context that we are talking about messages 
registered in the database.
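
To make the distinction concrete, here is a rough sketch of what I mean by
"proportion". It is not bogofilter's actual code; the function name
good_proportion and the sample data are made up for illustration, and each
registered message is assumed to be reduced to a set of tokens.

```python
def good_proportion(token, good_messages):
    """Fraction of the registered good messages that contain `token`.

    `good_messages` is a list of token sets, one per registered good
    message (a hypothetical representation, not bogofilter's database).
    """
    if not good_messages:
        # Nothing registered yet, so there is no proportion to report.
        return 0.0
    hits = sum(1 for tokens in good_messages if token in tokens)
    return hits / len(good_messages)


# Ten registered good messages; only the first contains "invoice".
registered_good = [{"invoice", "attached"}] + [{"lunch", "friday"}] * 9

print(good_proportion("invoice", registered_good))  # 1 of 10 messages
```

The result describes only the ten registered messages. It makes no claim
about the next message to arrive, which is the point of preferring
"proportion" over "likelihood".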
-- 
Peter Bishop 
Adelard and Centre for Software Reliability, City University
Drysdale Building, 10 Northampton Square, London, EC1V 0HB
Tel: +44-20-7490-9467, Fax: +44-20-7490-9451
pgb at adelard.com, http://www.adelard.com/
pgb at csr.city.ac.uk, http://www.city.ac.uk/




