Is bogofilter Bayesian?

Wed Feb 11 09:47:07 CET 2004

Greg Louis wrote:

>> Let me rephrase: We abuse the theory right from the
>> beginning (one might argue that this might in fact
>> help rather than hinder discrimination). The wording in the
>> FAQ suggests that we don't and only training on error adds a
>> (theoretical) flaw. This gives the wrong impression. Any
>> better wording is welcome.
> 
> Using Bayesian classification for email in the way we do, with full
> training, violates two assumptions on which Bayesian classification is
> based: independence of tokens within messages and a uniform
> distribution of scores.  In discussing training on error and training
> to exhaustion, we don't mention that.

Right. What do you think about an FAQ entry covering all of those
effects? It could list all the concerns and their possible
interpretations.
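
To make the independence concern quoted above concrete, here is a
minimal sketch in Python. It is a generic naive Bayes combination with
equal priors, not bogofilter's actual scoring code; the function name
and the 0.8 token probability are made up for illustration. It shows
how a second token that merely repeats the evidence of the first one
still pushes the score up when tokens are treated as independent.

```python
import math


def combine_independent(token_probs):
    """Combine per-token spam probabilities under the independence
    assumption (naive Bayes product rule, equal priors).

    Hypothetical helper for illustration only.
    """
    # Work in log space so long token lists do not underflow.
    log_spam = sum(math.log(p) for p in token_probs)
    log_ham = sum(math.log(1.0 - p) for p in token_probs)
    return 1.0 / (1.0 + math.exp(log_ham - log_spam))


# One token with an assumed spam probability of 0.8.
one = combine_independent([0.8])

# Two perfectly correlated tokens (say, two words that always occur
# together). They carry no extra information, but the product rule
# counts the same evidence twice.
two = combine_independent([0.8, 0.8])

print(round(one, 2))  # about 0.80
print(round(two, 2))  # about 0.94
```

Whether that double counting hurts or helps discrimination in practice
is exactly the open question; the sketch only shows that the
assumption is not met.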

pi