singletons

Jef Poskanzer jef at acme.com
Sat Dec 27 05:57:00 CET 2003


I mentioned last week that I'd like to see bogofilter use the count
of singletons in a message as a factor in determining bogosity.
I didn't have any concrete suggestions on how to do this, though.
Well, I just thought of one: every time a previously unseen token
gets scanned, also generate an artificial token called "Singleton"
or something like that.  Then just let the regular Bayesian statistics
operate.
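
To make the idea concrete, here is a rough Python sketch of the scanning step. It is not bogofilter code: the marker name, the function names, and the whitespace tokenizer are all made up for illustration, and "previously unseen" is assumed to mean "not in any wordlist yet".

```python
from collections import Counter

SINGLETON = "*Singleton*"  # hypothetical name for the artificial token


def scan(text, known):
    """Yield each token of text; after any token not in `known`,
    also yield the artificial SINGLETON marker."""
    for tok in text.lower().split():  # toy tokenizer, an assumption
        yield tok
        if tok not in known:
            yield SINGLETON


def train(text, counts, known):
    """Add a message's tokens (markers included) to one class's counts."""
    # Materialize first so "unseen" is judged against the state
    # before this message was registered.
    toks = list(scan(text, known))
    counts.update(toks)
    known.update(toks)
    return toks


known = set()
spam, ham = Counter(), Counter()
train("buy xqzv now", spam, known)
train("meeting now", ham, known)
train("buy kjwq", spam, known)
```

After training, the marker has its own spam and ham counts like any other token, so whatever statistics are applied to ordinary tokens can be applied to it without special handling.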

This might be something to try in the post-1.0 era, along with
two-token sequences.
---
Jef

         Jef Poskanzer  jef at acme.com  http://www.acme.com/jef/



