Crm114 style context matching. Phrases and partial phrases.
Greg Louis
glouis at dynamicro.on.ca
Sat May 17 12:49:11 CEST 2003
On 20030517 (Sat) at 0944:22 +0100, Anthony Clarke wrote:
> Hi,
>
> I've cobbled together a preprocessing script which allows phrases and
> partial phrases to be categorised like CRM114.
>
> I don't think I have enough messages (1600 spam, 200 nonspam) to try out
> the tuning scripts and get some firm results for this.
I do. I'd be very glad to evaluate your script if you care to let me
have a copy.
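
For anyone following along, here is a rough sketch of what such a preprocessing step might look like. It is not Anthony's script, which I have not seen. It assumes a sliding window of five words, in the spirit of CRM114's sparse phrase matching, and emits every combination of the preceding words kept or skipped, each ending in the current word. The function name `phrase_features` and the `<skip>` placeholder are made up for illustration.

```python
from itertools import product

SKIP = "<skip>"  # hypothetical placeholder marking an omitted word


def phrase_features(tokens, window=5):
    """Return phrases and partial phrases ending at each token."""
    features = []
    for i, last in enumerate(tokens):
        # Up to window-1 words preceding the current one.
        context = tokens[max(0, i - window + 1):i]
        # Each context word is either kept or replaced by SKIP.
        for mask in product((False, True), repeat=len(context)):
            parts = [w if keep else SKIP for w, keep in zip(context, mask)]
            parts.append(last)
            # Drop leading skips so "<skip> b c" is stored as "b c".
            while parts[0] == SKIP:
                parts.pop(0)
            features.append(" ".join(parts))
    return features


if __name__ == "__main__":
    for feature in phrase_features("buy cheap pills".split(), window=3):
        print(feature)
```

With a window of three, the three-word message above yields seven entries rather than three. With the full five-word window, each word past the fourth contributes sixteen.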
> The main disadvantage is that wordlists expand considerably.
They would, of course. CRM114 trains exclusively on error for that
reason. Performance becomes an issue too, I suspect. But those are
challenges we might be able to deal with if the method looks like a
major win.
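
To make the train-on-error idea concrete, here is a minimal sketch. The scorer is a toy of my own (a message counts as spam if its tokens have been seen more often in spam than in nonspam) and is neither bogofilter's nor CRM114's calculation. The names `train_on_error` and `classify` are likewise only illustrative.

```python
from collections import Counter


def classify(tokens, counts):
    """Toy scorer (an assumption): spam if seen more in spam than in ham."""
    spam = sum(counts["spam"][t] for t in tokens)
    ham = sum(counts["ham"][t] for t in tokens)
    return spam > ham


def train_on_error(messages, counts):
    """Add a message's tokens to the wordlist only when it is misclassified."""
    trained = 0
    for tokens, is_spam in messages:
        if classify(tokens, counts) != is_spam:
            counts["spam" if is_spam else "ham"].update(tokens)
            trained += 1
    return trained


if __name__ == "__main__":
    counts = {"spam": Counter(), "ham": Counter()}
    corpus = [
        (["buy", "pills"], True),
        (["buy", "pills", "now"], True),
        (["lunch", "today"], False),
    ]
    print(train_on_error(corpus, counts), "message(s) trained")
    print(len(counts["spam"]) + len(counts["ham"]), "wordlist entries")
```

Only messages the current wordlists get wrong add entries, so the lists grow with the number of mistakes and not with the size of the corpus.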
--
| G r e g L o u i s | gpg public key: finger |
| http://www.bgl.nu/~glouis | glouis at consultronics.com |
| http://wecanstopspam.org in signatures fights junk email |