significant digits?

David Relson relson at osagesoftware.com
Thu Apr 10 02:14:50 CEST 2003


At 08:05 PM 4/9/03, Jim Correia wrote:

>On Wednesday, April 9, 2003, at 07:07  PM, David Relson wrote:
>
>>No.  0.95 converts internally to 0.9500000 (or something similar) and 
>>bogofilter uses the converted value.
>
>Not all values for floats and doubles can be represented exactly in machine 
>form, so it is probably something like 0.949999988...
>
>>  Implementing rounding _could_ be done, but there's no point.  If 
>> someone wants 0.94966 to indicate spam, then they should specify a 
>> spam_cutoff value.
>
>My argument is that there might very well be a point, depending on the 
>statistical calculations (which I am not familiar with).
>
>If there really are only 2 significant digits in the spamicity 
>calculation, then my message *should* have been classified as spam. If, on 
>the other hand, the calculation really does have 5 significant digits, 
>then it should not have been classified as spam.
>
>I'm not suggesting that the spamicity be arbitrarily rounded to 2 places, 
>but rounded to the proper number of significant digits as the calculation 
>dictates.
>
>(This is probably only a pedantic point. The probability of a spamicity 
>calculation within such a narrow margin around the cutoff value is 
>probably low?)
>
>Jim

Jim,

Calculations are done using variables of type "double", whose size varies 
between architectures.  On Intel, a double is 8 bytes (64 bits).  According 
to float.h (the DBL_DIG constant), a double has 15 decimal digits of 
precision.  Bogofilter uses as much precision as the hardware will give it.

David





