[bogofilter] Improved Calculations

Boris 'pi' Piwinger 3.14 at piology.org
Thu May 13 09:27:15 CEST 2004


michael at optusnet.com.au wrote:

>> > Gary Robinson's blog has a new "Improved Chi" article (at
>> > http://www.garyrobinson.net/2004/04/improved_chi.html) which points to
>> > his new paper, "Handling Redundancy in Email Token Probabilities" (at
>> > http://garyrob.blogs.com//handlingtokenredundancy93.pdf).  
>> 
>> I read those. I have to say that, without knowing the
>> theory, I really don't understand this. The new parameters
>> may have some intuition behind them, but how they are used
>> looks like magic. That makes it hard to guess good values;
>> as the article explains, only testing seems to work.
>
>IMHO, at least the ham parameter should be computable from
>the information entropy of English text.

Why English? I receive messages in German and English,
occasionally in other languages. Certainly every language
would have such a parameter.
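
To make the entropy idea concrete, here is a rough sketch of
estimating the character-level Shannon entropy of a text sample.
It is not taken from Robinson's paper: the function name
char_entropy and the sample strings are made up for illustration,
and how such a number would map onto the ham parameter is left
open. It only shows that the estimate depends on the text, and so
on the language, it is computed from.

```python
import math
from collections import Counter


def char_entropy(text: str) -> float:
    """Shannon entropy, in bits per character, of the empirical
    character distribution of `text` (hypothetical helper)."""
    counts = Counter(text)
    total = sum(counts.values())
    if total == 0:
        return 0.0  # empty input carries no information
    # H = -sum(p * log2(p)) over the observed characters
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # Toy samples only; a real estimate would need a large corpus
    # per language (assumption: plain text, no tokenization).
    samples = {
        "English": "I receive messages in German and English.",
        "German": "Ich bekomme Nachrichten auf Deutsch und Englisch.",
    }
    for language, sample in samples.items():
        print(f"{language}: {char_entropy(sample):.3f} bits/char")
```

This is a zeroth-order estimate that ignores dependencies between
neighbouring characters, so it overstates the true entropy of
natural-language text.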

pi


