singletons

Boris 'pi' Piwinger 3.14 at logic.univie.ac.at
Sat Dec 27 16:01:57 CET 2003


Jef Poskanzer <jef at acme.com> wrote:

>I mentioned last week that I'd like to see bogofilter use the count
>of singletons in a message as a factor in determining bogosity.
>I didn't have any concrete suggestions on how to do this, though.
>Well, I just thought of one: every time a previously unseen token
>gets scanned, also generate an artificial token called "Singleton"
>or something like that.  Then just let the regular Bayesian statistics
>operate.

Do I understand correctly that you want to add the token
Singleton (or maybe Singleton:yes) to every message which
contains at least one token not seen before? I'd guess that
this is true for many messages, almost all of them if you are
not using full training. So I'd assume it has little to no value.
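
To make that guess concrete, here is a rough sketch with made-up
counts (they are not measurements from any real wordlist). It uses
the usual raw estimate p(w) = b/(b+g), where b and g are the
fractions of spam and ham messages containing the token. The
function name and all numbers are my own assumptions.

```python
def raw_spam_probability(spam_with, spam_total, ham_with, ham_total):
    """Raw estimate p(w) = b / (b + g), with b and g the fractions of
    spam and ham messages that contain the token."""
    spam_rate = spam_with / spam_total
    ham_rate = ham_with / ham_total
    return spam_rate / (spam_rate + ham_rate)

# Hypothetical counts: a "Singleton" marker present in almost every
# message of both classes.
p_singleton = raw_spam_probability(950, 1000, 930, 1000)

# Hypothetical counts: a token that really is much more common in spam.
p_rare = raw_spam_probability(200, 1000, 10, 1000)

print(f"Singleton marker: {p_singleton:.3f}")
print(f"Discriminating token: {p_rare:.3f}")
```

With counts like these the marker ends up very close to 0.5, so it
would hardly move the result either way.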

On the other hand, it seems you can do more by setting robx
and robs to reflect your estimate about singletons.
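
As a minimal sketch of what I mean (not bogofilter's actual code),
Robinson's smoothing is f(w) = (robs*robx + n*p(w)) / (robs + n),
where n is how often the token has been seen and p(w) is its raw
spam probability. For a token never seen before n is 0, so its
score is simply robx. The parameter values below are illustrative
assumptions, not bogofilter's defaults.

```python
def robinson_f(p, n, robx=0.5, robs=1.0):
    """Smoothed token score f(w) = (robs*robx + n*p) / (robs + n)."""
    return (robs * robx + n * p) / (robs + n)

# Previously unseen token: n = 0, so the score equals robx whatever p is.
unseen_neutral = robinson_f(p=0.0, n=0, robx=0.5)
unseen_spammy = robinson_f(p=0.0, n=0, robx=0.6)

# Token seen once, in spam only (p = 1.0, n = 1):
# a small robs trusts that single observation, a large robs stays near robx.
seen_once_small_robs = robinson_f(p=1.0, n=1, robx=0.5, robs=1.0)
seen_once_large_robs = robinson_f(p=1.0, n=1, robx=0.5, robs=10.0)

print(unseen_neutral, unseen_spammy)
print(seen_once_small_robs, seen_once_large_robs)
```

So if you believe unseen tokens lean towards spam, raising robx
a bit above 0.5 already expresses that, and robs controls how
quickly real counts override the assumption.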

pi



