CRM114 Discriminator

Greg Louis glouis at dynamicro.on.ca
Tue Feb 10 15:55:40 CET 2004


On 20040210 (Tue) at 0930:53 -0500, Tom Anderson wrote:
> http://crm114.sourceforge.net/
> 
> I think you guys would be interested in reading about this.  I'm not
> sure I understand the method completely, but it's related to but
> different from Bayesian filtering.  The FAQ also makes a strong case for
> training-on-error.
> 
I've been following Bill Yerazunis's excellent project for quite a
while; in fact, since late in 2002.  The two main differences from
bogofilter, spambayes et al. are the use of phrases (rather than single
words) as tokens and the application of the Bayesian chain rule.
Together these make for somewhat slow but very accurate classification.
Because of the humungous training database that would otherwise result,
crm114 pretty much has to be trained on error.
Applying the chain rule gives an essentially binary output (scores land
at or very near 0 or 1, so there are no unsures).  For a look at using
it without phrases, see
  http://www.bgl.nu/bogofilter/BcrFisher.html
where I compare the chain rule approach to Fisher's method.
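
To make the "no unsures" point concrete, here is a rough Python sketch
of the two ways of combining per-token spam probabilities.  It is only
an illustration, not code from crm114 or bogofilter: the function names,
the 0.5 prior, the toy token probabilities and the convention that 1.0
means spam are all my assumptions, and every probability is assumed to
lie strictly between 0 and 1.

```python
import math

def chain_rule(probs, prior=0.5):
    """Fold per-token spam probabilities into one score via Bayes' rule."""
    score = prior
    for p in probs:
        # Posterior after one more token, treating tokens as independent.
        score = score * p / (score * p + (1.0 - score) * (1.0 - p))
    return score

def chi2_sf_even(x, dof):
    """Chi-square survival function; closed form valid for even dof only."""
    m = x / 2.0
    term = math.exp(-m)
    total = term
    for i in range(1, dof // 2):
        term *= m / i
        total += term
    return min(total, 1.0)

def fisher(probs):
    """Fisher-style combination in the (1 + H - S) / 2 form; 1.0 = spam."""
    n = len(probs)
    h = chi2_sf_even(-2.0 * sum(math.log(p) for p in probs), 2 * n)
    s = chi2_sf_even(-2.0 * sum(math.log(1.0 - p) for p in probs), 2 * n)
    return (1.0 + h - s) / 2.0

# Toy message: twelve mildly spammy tokens and six mildly hammy ones.
tokens = [0.8] * 12 + [0.3] * 6
print("chain rule:", chain_rule(tokens))
print("fisher:    ", fisher(tokens))
```

With mixed evidence like this, the chain rule score is driven almost
all the way to 1, while the Fisher-style score stays noticeably further
from the extremes, which is where an "unsure" band can live.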

-- 
| G r e g  L o u i s         | gpg public key: 0x400B1AA86D9E3E64 |
|  http://www.bgl.nu/~glouis |   (on my website or any keyserver) |
|  http://wecanstopspam.org in signatures helps fight junk email. |



