suggestions/requests

Eric S. Raymond esr at thyrsus.com
Thu Jan 23 00:07:24 CET 2003


Dew-Jones, Malcolm MSER:EX <Malcolm.DewJones at gems5.gov.bc.ca>:
> For each message you generate all appropriate pseudo words, and then add one
> to each pseudo word in the word list.  For example the pseudo word
> "document length in bytes 101-500" would count the number of documents that
> fell in that length range.

This is a good idea, but IMO it doesn't belong in bogofilter itself.
Bogofilter should stick to doing one thing -- Bayesian analysis of
presented features -- and doing it well.

Your statistics should be gathered by a separate feature extractor which
feeds bogofilter.  I'm working on a framework for such tests now; it's
called `bogometer'.
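
To make the split concrete, here is a minimal sketch of what a standalone
feature extractor for the length pseudo word might look like.  It is not
bogometer and not part of bogofilter; the function names, the token spelling,
and every bucket other than 101-500 are assumptions made up for illustration.

```python
# Hypothetical bucket boundaries in bytes. Only the 101-500 range comes
# from the quoted proposal; the others are placeholders.
BUCKETS = [(0, 100), (101, 500), (501, 2000)]


def length_pseudo_word(message: bytes) -> str:
    """Return the pseudo word naming the length bucket a message falls in."""
    size = len(message)
    for low, high in BUCKETS:
        if low <= size <= high:
            return f"pseudo-length-bytes-{low}-{high}"
    # Anything larger than the last bucket gets a catch-all token.
    return f"pseudo-length-bytes-over-{BUCKETS[-1][1]}"


def count_pseudo_words(messages, counts=None):
    """Add one to each message's pseudo word in a word list (a plain dict here)."""
    counts = {} if counts is None else counts
    for message in messages:
        word = length_pseudo_word(message)
        counts[word] = counts.get(word, 0) + 1
    return counts
```

The extractor only produces tokens and counts; what the downstream filter
does with them is left to the filter, which is the point of keeping the two
apart.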

Let's keep this project small and clean and lightweight, people.
-- 
		<a href="http://www.tuxedo.org/~esr/">Eric S. Raymond</a>



