better Bayesian bogofilter

Greg Louis glouis at dynamicro.on.ca
Tue Aug 12 14:15:25 CEST 2003


On 20030812 (Tue) at 1135:28 +0200, Matthias Andree wrote:
> Boris 'pi' Piwinger <3.14 at logic.univie.ac.at> writes:
> 
> > Greg Louis <glouis at dynamicro.on.ca> wrote:
> >
> >>Conclusion
> >>
> >>The suggestions presented above under "Hypothesis" should be adopted in
> >>bogofilter.
> >
> > Is it correct to assume that this would come with virtually
> > no cost, since the more complicated calculation is not done
> > very often in each run of bogofilter?
> 
> I think that's safe to assume.
> 
By "more complicated calculation" you mean "equation #5", right?  Yeah,
just once per token ;)  On the other hand, if you train with every
message or randomly select errors-and-unsures to keep the ratio right,
you get to use equation #4, which saves 3 divisions per token over what
we do now.

-- 
| G r e g  L o u i s         | gpg public key: 0x400B1AA86D9E3E64 |
|  http://www.bgl.nu/~glouis |   (on my website or any keyserver) |
|  http://wecanstopspam.org in signatures helps fight junk email. |



