Wordlist Histogram [was: What did I do wrong? ]

David Relson relson at osagesoftware.com
Thu Feb 19 22:54:07 CET 2004


On Thu, 19 Feb 2004 14:51:21 +0100
Boris 'pi' Piwinger wrote:

> David Relson wrote:
> 
> [bogoutil -H]
> > hapaxes:  ham  375505 (29.72%), spam  443797 (35.12%)
> >    pure:  ham  562881 (44.55%), spam  616022 (48.75%)
> 
> What is the meaning of pure? Tokens which have been seen
> only once for one category, but possibly many times in the
> other?

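Hapaxes have a total ham+spam count of 1.  "Pure" means that either the
ham count or the spam count is 0, i.e. the token has only ever been seen
in one category, however many times.  Given this, all hapaxes are also
"pure".  I'm open to suggestions for better labels :-)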
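
To make the two definitions concrete, here is a minimal Python sketch.
It is not bogoutil's actual code; the function names and the sample
wordlist are made up for illustration, and it assumes every token in the
wordlist has a nonzero total count.

```python
from collections import Counter

def is_hapax(ham: int, spam: int) -> bool:
    # A hapax has been seen exactly once overall.
    return ham + spam == 1

def is_pure(ham: int, spam: int) -> bool:
    # A pure token has been seen in only one category, any number of times.
    # Assumption: a token with both counts at 0 is not counted as pure.
    return (ham == 0 or spam == 0) and ham + spam > 0

def summarize(wordlist: dict) -> Counter:
    # wordlist maps token -> (ham_count, spam_count); hypothetical layout.
    totals = Counter()
    for ham, spam in wordlist.values():
        if not is_pure(ham, spam):
            continue
        side = "ham" if spam == 0 else "spam"
        totals["pure", side] += 1
        if is_hapax(ham, spam):
            # Every hapax is pure, so it is counted in both rows.
            totals["hapax", side] += 1
    return totals

sample = {
    "meeting": (3, 0),   # pure ham, not a hapax
    "xyzzy":   (1, 0),   # ham hapax, therefore also pure ham
    "viagra":  (0, 7),   # pure spam, not a hapax
    "qwerty":  (0, 1),   # spam hapax, therefore also pure spam
    "hello":   (5, 2),   # seen in both categories: neither
}
totals = summarize(sample)
```

With this sample, the "pure" row counts two ham and two spam tokens, and
the "hapaxes" row counts one of each, which is why the pure figures in
the histogram are always at least as large as the hapax figures.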

> BTW: The option is not in the man page.

Another detail to take care of :-<



