singletons
Jef Poskanzer
jef at acme.com
Sat Dec 27 05:57:00 CET 2003
I mentioned last week that I'd like to see bogofilter use the count
of singletons in a message as a factor in determining bogosity.
I didn't have any concrete suggestions on how to do this, though.
Well, I just thought of one: every time a previously unseen token
gets scanned, also generate an artificial token called "Singleton"
or something like that. Then just let the regular Bayesian statistics
operate.
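
A minimal sketch of that idea, in Python rather than bogofilter's C.
Everything named here is an assumption for illustration: the
`tokens_with_singletons` helper, the `known_tokens` set standing in
for the trained wordlist, and the simple lowercase tokenizer. None of
it is bogofilter's actual code or API.

```python
import re

# Marker token name suggested in the message. Real tokens are
# lowercased below, so the capitalized marker cannot collide with them.
SINGLETON = "Singleton"


def tokenize(text):
    """Very simple stand-in tokenizer: lowercase alphanumeric runs."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tokens_with_singletons(text, known_tokens):
    """Return the message tokens, adding one SINGLETON marker after
    every token that is not in known_tokens (the hypothetical set of
    tokens already present in the trained wordlist)."""
    out = []
    for tok in tokenize(text):
        out.append(tok)
        if tok not in known_tokens:
            out.append(SINGLETON)
    return out


if __name__ == "__main__":
    known = {"hello", "world"}
    print(tokens_with_singletons("Hello brave new world", known))
```

The resulting token list would then go through the usual scoring
unchanged, so the marker picks up its own spam/non-spam counts like
any other token. Whether the marker should count once per message or
once per unseen token is left open here; this sketch emits it every
time.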
This might be something to try in the post-1.0 era, along with
two-token sequences.
---
Jef
Jef Poskanzer jef at acme.com http://www.acme.com/jef/