How big of a number can you store in double and float in C?

Views: 71 · Published: 2021/5/8 19:54:53 · Tags: c, floating-point, sizeof


Question

I am trying to figure out exactly how big a number I can store in a float and in a double. Integers behave as I expect, but the floating-point types do not. A double should hold 8 bytes of information, which ought to be enough for the value I assign to a, yet it prints 1234567890123456768, with the last two digits changed. And when I store 2147483648 in the float variable b, or the same number with any other last digit, it always prints 2147483648, which is supposed to be the limit. So what's going on?

    #include <math.h>
    #include <stdio.h>

    int main(void)
    {
        double a;
        float b;
        int c;

        a = 1234567890123456789;
        b = 2147483648;
        c = 2147483647;

        /* sizeof yields a size_t, so print it with %zu rather than %d */
        printf("Bytes of double: %zu\n", sizeof(double));
        printf("Bytes of integer: %zu\n", sizeof(int));
        printf("Bytes of float: %zu\n", sizeof(float));

        printf("\n");

        printf("You can count up to %.0f in 4 bytes\n", pow(2,32));
        printf("You can count up to %.0f with + or - sign in 4 bytes\n", pow(2,31));
        printf("You can count up to %.0f in 8 bytes\n", pow(2,64));
        printf("You can count up to %.0f with + or - sign in 8 bytes\n", pow(2,63));

        printf("\n");

        /* a float argument is promoted to double, so %.0f works for both */
        printf("double number: %.0f\n", a);
        printf("floating point: %.0f\n", b);
        printf("integer: %d\n", c);

        return 0;
    }

Recommended answer

The largest finite number that a floating-point type can store is FLT_MAX for float and DBL_MAX for double; both macros are defined in <float.h>.

However, that doesn't mean the type can precisely represent every smaller number, or even every smaller integer (in fact, it is not even close).

First, you need to understand that not all bits of a floating-point number are "equal". A floating-point number has an exponent (8 bits in an IEEE-754 float, 11 bits in a double) and a mantissa (23 bits in a float, 52 bits in a double). The value is obtained by multiplying the mantissa (which has an implied leading 1 bit followed by the binary point) by 2^exponent, where the exponent is stored with a bias, so its raw binary value is not used directly. There is also a separate sign bit, so everything below applies to negative numbers as well.

As the exponent changes, the distance between consecutive mantissa values changes with it: the greater the exponent, the further apart consecutive representable values are. You may therefore be able to store one number of a given magnitude precisely, but not the "next" one. Remember too that some seemingly simple fractions cannot be represented exactly with any finite number of binary digits: 1/10 is an infinitely repeating sequence in binary, just as 1/3 is in decimal.

When it comes to integers, you can precisely represent every integer up to a magnitude of 2^{mantissa_bits + 1}. Thus an IEEE-754 float can represent all integers up to 2²⁴, and a double all integers up to 2⁵³ (in the upper half of each range, consecutive floating-point values are exactly one integer apart, since the entire mantissa is used for the integer part). Individual larger integers can be represented too, but they are spaced more than one integer apart: you can represent some integers greater than 2^{mantissa_bits + 1}, but every integer only up to that magnitude.

For example:

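    #include <math.h>
    #include <stdio.h>
    int main(void)
    {
        float f = powf(2.0f, 24.0f);              /* 2^24: every integer up to here is exact in a float */
        float f1 = f + 1.0f, f2 = f + 2.0f;       /* f + 1 is not representable, f + 2 is */
        double d = pow(2.0, 53.0);                /* 2^53: the same limit for a double */
        double d1 = d + 1.0, d2 = d + 2.0;
        (void) printf("2**24 float = %.0f, +1 = %.0f, +2 = %.0f\n", f, f1, f2);
        (void) printf("2**53 double = %.0f, +1 = %.0f, +2 = %.0f\n", d, d1, d2);
        return 0;
    }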

Output:

2**24 float = 16777216, +1 = 16777216, +2 = 16777218
2**53 double = 9007199254740992, +1 = 9007199254740992, +2 = 9007199254740994

As you can see, adding 1 to 2^{mantissa_bits + 1} makes no difference, because the exact result is not representable and rounds back to the original value. Adding 2 does produce the correct answer: at this magnitude the representable numbers are two integers apart, since the multiplier has doubled.

TL;DR An IEEE-754 float can precisely represent all integers up to 2²⁴, and a double all integers up to 2⁵³, but only some integers of greater magnitude (the spacing of representable values depends on the magnitude).
