significant digits?
David Relson
relson at osagesoftware.com
Thu Apr 10 02:14:50 CEST 2003
At 08:05 PM 4/9/03, Jim Correia wrote:
>On Wednesday, April 9, 2003, at 07:07 PM, David Relson wrote:
>
>>No. 0.95 converts internally to 0.9500000 (or something similar) and
>>bogofilter uses the converted value.
>
>All values for floats and doubles can't be represented exactly in machine
>form, so it is probably something like 0.949999988...
>
>> Implementing rounding _could_ be done, but there's no point. If
>> someone wants 0.94966 to indicate spam, then they should specify a
>> spam_cutoff value.
>
>My argument is that there might very well be a point, depending on the
>statistical calculations (which I am not familiar with).
>
>If there really are only 2 significant digits in the spamicity
>calculation, then my message *should* have been classified as spam. If, on
>the other hand, the calculation really does have 5 significant digits,
>then it should not have been classified as spam.
>
>I'm not suggesting that the spamicity be arbitrarily rounded to 2 places,
>but rounded to the proper number of significant digits as the calculation
>dictates.
>
>(This is probably only a pedantic point. The probability of a spamicity
>calculation within such a narrow margin around the cutoff value is
>probably low?)
>
>Jim
Jim,
Calculations are done using variables of type "double", whose size and
precision vary between architectures. On Intel, a double is 8 bytes (64
bits). According to float.h, a double has 15 decimal digits of precision.
Bogofilter uses as much precision as the hardware will give it.
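
To check these numbers on a particular machine, something like the following should work. It is only a sketch and not code from bogofilter: the helper names show_double_precision() and narrowing_changes_value() are made up for illustration, and the exact digits printed depend on the platform's float.h and printf.

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Hypothetical helper (not part of bogofilter): print what this
 * platform's float.h reports for double, and how 0.95 is stored
 * as a float and as a double. */
void show_double_precision(void)
{
    float  f = 0.95f;
    double d = 0.95;

    /* The C standard requires at least 10 decimal digits for double. */
    assert(DBL_DIG >= 10);

    printf("sizeof(double) = %lu bytes\n", (unsigned long)sizeof(double));
    printf("FLT_DIG        = %d\n", FLT_DIG);
    printf("DBL_DIG        = %d\n", DBL_DIG);

    /* 17 significant digits are enough to expose the stored value. */
    printf("0.95 as float  = %.17g\n", (double)f);
    printf("0.95 as double = %.17g\n", d);
}

/* Hypothetical helper: returns nonzero if storing x in a float
 * changes its value.  The volatile forces a real narrowing on
 * machines that keep intermediates in wider registers. */
int narrowing_changes_value(double x)
{
    volatile float f = (float)x;
    return (double)f != x;
}
```

Calling show_double_precision() from main() prints the values. On an IEEE 754 machine the float line should begin 0.949999988, which matches the figure Jim quoted, while the double line stays much closer to 0.95. narrowing_changes_value(0.95) should be nonzero there, and narrowing_changes_value(0.5) should be zero, since 0.5 is exactly representable.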
David