Why can't uint64_t show pow(2, 64) - 1 properly?
Question
I'm trying to understand why the uint64_t type cannot show pow(2, 64) - 1 properly. The value of __cplusplus is 199711L, so the compiler is in C++98 mode.
I checked the pow() function under the C++98 standard, which declares these overloads:
double pow (double base , double exponent);
float pow (float base , float exponent);
long double pow (long double base, long double exponent);
double pow (double base , int exponent);
long double pow (long double base, int exponent);
So I wrote the following snippet:
double max1 = (pow(2, 64) - 1);
cout << max1 << endl;
uint64_t max2 = (pow(2, 64) - 1);
cout << max2 << endl;
uint64_t max3 = -1;
cout << max3 << endl;
The output is:
max1: 1.84467e+019
max2: 9223372036854775808
max3: 18446744073709551615
Recommended answer
Floating point numbers have finite precision.
On your system (and typically, assuming the IEEE-754 binary64 format), 18446744073709551615 is not a number that has a representation in the double format. A double carries a 53-bit significand, so just below 2^64 the representable values are 2048 apart. The closest number that does have a representation is 18446744073709551616, which is exactly 2^64.
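The representability claim can be checked with a minimal sketch like the one below. It assumes IEEE-754 binary64 doubles and round-to-nearest conversion, which is what mainstream platforms use; the function names are made up for illustration.

```cpp
#include <cmath>
#include <limits>
#include <stdint.h>

// Largest uint64_t value: 2^64 - 1 = 18446744073709551615.
inline uint64_t max_u64() {
    return std::numeric_limits<uint64_t>::max();
}

// Converts 2^64 - 1 to double. A 53-bit significand cannot hold
// 64 one-bits, so the value is rounded to a neighbouring double.
inline double max_u64_as_double() {
    return static_cast<double>(max_u64());
}

// True when the rounded value is exactly 2^64.
// Assumption: binary64 with round-to-nearest conversion.
// std::ldexp(1.0, 64) builds 2^64 exactly.
inline bool rounds_up_to_two_pow_64() {
    return max_u64_as_double() == std::ldexp(1.0, 64);
}
```

On a typical x86-64 build, rounds_up_to_two_pow_64() is expected to return true, which is why max1 in the question can never hold 2^64 - 1.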
Subtracting (or adding) two floating point numbers of wildly different magnitudes usually produces a rounding error. This error can be significant in relation to the smaller operand. In the case of 18446744073709551616. - 1. -> 18446744073709551616.
the error of the subtraction is 1, which is the same value as the smaller operand: the 1 is lost entirely.
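A small sketch of this absorption effect, again assuming IEEE-754 binary64 arithmetic with the default rounding mode. The helper is_absorbed is a hypothetical name, and std::ldexp is used instead of pow only to build 2^64 without depending on the accuracy of pow.

```cpp
#include <cmath>

// Returns 2^64 as a double (exactly representable).
inline double two_pow_64() {
    return std::ldexp(1.0, 64);
}

// True when subtracting small from big leaves big unchanged,
// i.e. the exact difference rounds straight back to big.
inline bool is_absorbed(double big, double small) {
    return (big - small) == big;
}
```

Under those assumptions, is_absorbed(two_pow_64(), 1.0) is true because 1 is far below the 2048 spacing of doubles just under 2^64, while is_absorbed(two_pow_64(), 2048.0) is false because 2^64 - 2048 is itself a double.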
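When a floating point value is converted to an integer type, and the value cannot fit in the integer type, the behaviour of the program is undefined, even when the integer type is unsigned. That is what happens with max2: the double holds 18446744073709551616, one more than the largest uint64_t, so the printed 9223372036854775808 is just what this particular build happened to produce. The conversion in max3 is different: -1 is an integer, and integer-to-unsigned conversion is defined to wrap modulo 2^64.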
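One way to stay clear of the undefined conversion is to range-check the double first, and to obtain the maximum without floating point at all. The sketch below is only an illustration: to_uint64_checked, max_via_limits and max_via_wraparound are hypothetical names, and the bound assumes binary64 doubles.

```cpp
#include <cmath>
#include <limits>
#include <stdint.h>

// Hypothetical helper: converts d to uint64_t only when the truncated
// value fits. Returns false and leaves out untouched otherwise.
inline bool to_uint64_checked(double d, uint64_t& out) {
    const double two64 = std::ldexp(1.0, 64);  // 2^64, the first value that does NOT fit
    // Written in negated form so that NaN (all comparisons false) is rejected too.
    if (!(d > -1.0 && d < two64)) {
        return false;
    }
    out = static_cast<uint64_t>(d);  // truncated value is in [0, 2^64 - 1]
    return true;
}

// Well-defined ways to get 2^64 - 1 without going through double.
inline uint64_t max_via_limits() {
    return std::numeric_limits<uint64_t>::max();
}
inline uint64_t max_via_wraparound() {
    return static_cast<uint64_t>(-1);  // integer conversion wraps modulo 2^64, like max3
}
```

With this helper, the value from the question (2^64 as a double) is rejected instead of being converted, and both max functions return 18446744073709551615.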