Double precision strange behaviour. Need an explanation


Question

Here is the code:

#include <stdio.h>
#include <math.h>

static double const x = 665857;
static double const y = 470832;

int main(){
    double z = x*x*x*x -(y*y*y*y*4+y*y*4);
    printf("%f \n",z);
    return 0;
}

Mysteriously (to me), this code prints "0.0" when compiled with GCC 4.6 on a 32-bit machine (or, as in my case, on a 64-bit machine with the -m32 flag). As far as I know, floating-point operations can overflow, underflow, or lose precision, but... a 0? How?

Thanks in advance.

Recommended answer

This is a result of the way IEEE 754 represents floating-point numbers in normalized form. A float, a double, or any other IEEE 754 compliant binary representation is stored as:

1.xxxxxxxxxxxxxxxxxxx * 2^exp

where xxxxxxxxxxxxxxxxxxx is the fractional part of the mantissa, so the mantissa itself is always in the range [1, 2). The integer part, which is always 1, is not stored in the representation. The number of x bits defines the precision: 52 bits for a double. The exponent is stored in biased form (subtract 1023 to obtain its value), but that is irrelevant here.

665857^4 in 64-bit IEEE 754 is:

0 10001001100 (1)0100110100000001111100111011101010000101110010100010
+ exponent    mantissa

(the first bit is the sign bit: 0 = positive, 1 = negative; the bit in parentheses is not actually stored)

In 80-bit x86 extended precision it is (second row; the 64-bit form is repeated in the first row for comparison):

0 10001001100    (1)0100110100000001111100111011101010000101110010100010
0 100000001001100 1 010011010000000111110011101110101000010111001010000111000111011

(here the integer part is explicitly part of the representation, unlike in the 64-bit IEEE 754 format; I've aligned the mantissas for clarity)

4*470832^4 in 64-bit IEEE 754 and 80-bit x86 extended precision is:

0 10001001100    (1)0100110100000001111100111011101001111111010101100111
0 100000001001100 1 010011010000000111110011101110100111111101010110011100100010000

4*470832^2 in 64-bit IEEE 754 and 80-bit x86 extended precision is:

0 10000100110    (1)1001110011101010100101010100100000000000000000000000
0 100000000100110 1 100111001110101010010101010010000000000000000000000000000000000

When you sum the last two numbers, the procedure is as follows: the smaller value has its exponent adjusted to match the larger value's exponent, while its mantissa is shifted to the right in order to preserve the value. Since the two exponents differ by 38, the mantissa of the smaller number is shifted 38 bits to the right:

4*470832^2, adjusted, in 64-bit IEEE 754 and 80-bit x86 extended precision:

 this bit came from 1.xxxx ------------------------------v
0 10001001100    (0)0000000000000000000000000000000000000110011100111010|1010
0 100000001001100 0 0000000000000000000000000000000000000110011100111010101001010101

Now both quantities have the same exponent and their mantissas can be summed:

0 10001001100 (1)0100110100000001111100111011101001111111010101100111|0010
0 10001001100 (0)0000000000000000000000000000000000000110011100111010|1010
--------------------------------------------------------------------------
0 10001001100 (1)0100110100000001111100111011101010000101110010100001|1100

I kept some of the 80-bit precision bits to the right of the bar, because internally the summation is done in the greater precision of 80 bits.

Now let's perform the subtraction in 64-bit plus some bits of the 80-bit representation:

minuend    0 10001001100 (1)0100110100000001111100111011101010000101110010100001|1100
subtrahend 0 10001001100 (1)0100110100000001111100111011101010000101110010100001|1100
-------------------------------------------------------------------------------------
difference 0 10001001100 (0)0000000000000000000000000000000000000000000000000000|0000

A pure 0! If you perform the calculations in full 80-bit precision, you once again obtain a pure 0.

The real problem here is that 1.0 cannot be represented in 64-bit precision when the exponent is 77 (a scale of 2^77): the mantissa does not have 77 bits of precision. The same is true for 80-bit precision: the mantissa has only 63 fraction bits, 14 fewer than necessary to represent 1.0 at an exponent of 77.

So that's it! It's just the wonderful world of scientific computing, where nothing works the way you were taught in math class...
