Is bogofilter Bayesian?

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Wed Feb 11 09:47:07 CET 2004


Greg Louis wrote:

>> Let me rephrase: We abuse the theory right from the
>> beginning (one might argue that this in fact helps
>> rather than hinders discrimination). The wording in the
>> FAQ suggests that we don't, and that only training on error
>> adds a (theoretical) flaw. This gives the wrong impression.
>> Any better wording is welcome.
> 
> Using Bayesian classification for email in the way we do, with full
> training, violates two assumptions on which Bayesian classification is
> based: independence of tokens within messages and uniform distribution
> of scores. In discussing training on error and training to exhaustion,
> we don't mention that.

Right. What do you think about an FAQ entry covering all of
those effects? It could list all the concerns and possible
interpretations.

pi



