better Bayesian bogofilter

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Tue Aug 12 14:20:27 CEST 2003


Greg Louis wrote:

>> > Is it correct to assume that this would come with virtually
>> > no cost, since the more complicated calculation is not done
>> > very often in each run of bogofilter?
>> 
>> I think that's safe to assume.
>> 
> By "more complicated calculation" is meant "equation #5", right? 

Yes.

> Yeah, just once per token ;) 

So only a couple of hundred more divisions per mail.

> On the other hand, if you train with every
> message or randomly select errors-and-unsures to keep the ratio right,
> you get to use equation #4, which saves 3 divisions per token over what
> we do now.

I guess most people (including me) don't care about the
ratio in the database.

pi




