How to Calculate Double + Float Precision
Problem description

I have been trying to find out how to calculate the precision/range numbers for float and double:

-3.402823e38 .. 3.402823e38 and -1.79769313486232e308 .. 1.79769313486232e308.

For int32 you would do 2^32 = 4294967296, divide by 2, and get a range of -2147483648 to 2147483647. So how do I figure out the precision numbers for float and double? I think I am searching for the wrong terms, since nothing is coming up anywhere.

Well, both types actually look like the following:

[sign] [exponent] [mantissa]

representing a number in the following form:

[sign] 1.[mantissa] × 2^[exponent]

with the sizes of the exponent and mantissa varying. For float the exponent is eight bits wide, while double has an eleven-bit exponent. Furthermore, the exponent is stored unsigned with a bias, which is 127 for float and 1023 for double. This results in an exponent range of −126 through 127 for float and −1022 through 1023 for double (the all-zeros and all-ones exponent patterns are reserved for zero and subnormal numbers and for infinity and NaN, respectively).

The exponent is the exponent for 2^something, so when calculating 2^127 you get 1.7 × 10^38, which puts you in the approximate range of the float maximum value. Similarly for double, 2^1023 gives 9 × 10^307.

Obviously those numbers are not exactly the ones we expect. This is where the mantissa comes into play. The mantissa represents a normalized binary number that always begins with "1." (that's the normalized part). The rest is simply the digits after the point. Since the maximum mantissa is then roughly 1.111111111... in binary, which is almost 2, we get approximately 3.4 × 10^38 as float's maximum value and 1.79 × 10^308 as the maximum value for double.

[EDIT 2011-01-06] As Mark points out below (and below the question), the exact formula is the following:

2^(2^(e−1)) × (1 − 2^(−p))

where e is the number of bits in the exponent and p is the number of bits in the mantissa, including the aforementioned implicit bit (due to normalization). The formula replicates what we have seen above, only now it is exact. The first factor, 2^(2^(e−1)), is 2 raised to the maximum exponent, multiplied by two (we save the two in the second factor). The second factor is the largest number we can represent below one. I said above that the mantissa is almost two. Since we exaggerated the exponent by a factor of two in this formula, we need to account for that and now have a number that is almost one. I hope it's not too confusing. In any case, for float (with e = 8 and p = 24) we get the exact value 340282346638528859811704183484516925440, or roughly 3.4 × 10^38. double then yields (with e = 11 and p = 53) 179769313486231570814527423731704356798070567525844996598917476803157260780028538760589558632766878171540458953514382464234321326889464182768467546703537516986049910576551282076245490090389328944075868508455133942304583236903222948165808559332123348274797826204144723168738177180919299881250404026184124858368, or roughly 1.80 × 10^308. [/EDIT]

Another thing: you bring up the term "precision" in your question, but you quote the ranges of the types. Precision is quite a different thing and refers to how many significant digits the type can retain.
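The difference between range and precision can be made concrete with a small check. The sketch below uses Python, which is an assumption, since the question names no language. It evaluates the formula 2^(2^(e−1)) × (1 − 2^(−p)) for the largest finite value of both types, then shows that a float, despite its large range, cannot tell 16777216 and 16777217 apart. The helper names `max_finite` and `to_float32` are hypothetical and exist only for this illustration.

```python
import struct
import sys
from fractions import Fraction


def max_finite(e: int, p: int) -> int:
    """Largest finite value, 2^(2^(e-1)) * (1 - 2^-p), computed exactly.

    e = number of exponent bits, p = mantissa bits including the implicit bit.
    """
    return int(2 ** (2 ** (e - 1)) * (1 - Fraction(1, 2 ** p)))


def to_float32(x: float) -> float:
    """Round a Python float (an IEEE 754 double) to the nearest 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


FLOAT_MAX = max_finite(8, 24)     # range of float
DOUBLE_MAX = max_finite(11, 53)   # range of double

print(FLOAT_MAX)
print(DOUBLE_MAX == int(sys.float_info.max))

# Precision: 2^24 + 1 needs 25 significant bits, so a float has to round it.
print(to_float32(16777217.0))
```

Using `Fraction` keeps the arithmetic exact; evaluating the same expression in floating point would overflow or round before the result could be compared.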
For precision, the answer again lies in the mantissa, which is 23 and 52 bits for float and double, respectively. Since the numbers are stored normalized, there is actually an implicit bit added to that, which puts us at 24 and 53 bits. Now, the digits after the decimal (or here, binary) point work as follows:

  1  .  1      0      1      1
  ↑     ↑      ↑      ↑      ↑
 2^0   2^-1   2^-2   2^-3   2^-4
  =     =      =      =      =
  1    0.5    0.25   0.125  0.0625
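To connect the place-value diagram with the [sign] [exponent] [mantissa] layout, here is a minimal Python sketch (the language is an assumption) that splits a 32-bit float into its three fields and rebuilds the value from them. The function names `decode_float32` and `rebuild` are hypothetical, and the sketch only handles normalized numbers, not zero, subnormals, infinity or NaN.

```python
import struct


def decode_float32(x: float):
    """Return (sign, unbiased exponent, 23 stored mantissa bits) of x as a float."""
    (bits,) = struct.unpack("<I", struct.pack("<f", x))
    sign = bits >> 31
    exponent = ((bits >> 23) & 0xFF) - 127   # remove the bias of 127
    mantissa = bits & 0x7FFFFF               # stored bits; the leading 1 is implicit
    return sign, exponent, mantissa


def rebuild(sign: int, exponent: int, mantissa: int) -> float:
    """Evaluate [sign] 1.[mantissa] x 2^[exponent] for a normalized float."""
    significand = 1 + mantissa / 2 ** 23     # add the implicit leading bit
    return (-1) ** sign * significand * 2.0 ** exponent


# 1.1011 in binary is 1 + 0.5 + 0.125 + 0.0625 = 1.6875
sign, exponent, mantissa = decode_float32(1.6875)
print(sign, exponent, format(mantissa, "023b"))
print(rebuild(sign, exponent, mantissa))
```

The stored mantissa of 1.6875 starts with the bits 1011, matching the digits after the point in the diagram; the leading 1 never appears in the stored field.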
So the very last digit in the double mantissa represents a value of roughly 2.2 × 10^−16, or 2^−52. If the exponent is 0 (that is, for numbers between 1 and 2), this is the smallest value we can add to the number, which places the precision of double at around 16 decimal digits. Likewise for float, with roughly seven digits.
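The digit counts can be checked with a short Python sketch (the language is an assumption). It takes the value of the last mantissa bit at exponent 0 to be 2^−52 for double and 2^−23 for float, and estimates the number of decimal digits as p × log10(2), where p is the mantissa width including the implicit bit. The variable names are illustrative only.

```python
import math
import sys

# Value of the last mantissa bit for numbers between 1 and 2 (exponent 0).
double_eps = 2.0 ** -52
float_eps = 2.0 ** -23

print(double_eps)                            # roughly 2.2e-16
print(double_eps == sys.float_info.epsilon)  # Python floats are doubles

# Adding the last bit to 1.0 changes it; adding half of it rounds back to 1.0.
print(1.0 + double_eps > 1.0)
print(1.0 + double_eps / 2 == 1.0)

# Approximate decimal digits carried by p mantissa bits: p * log10(2).
double_digits = 53 * math.log10(2)
float_digits = 24 * math.log10(2)
print(round(double_digits, 2), round(float_digits, 2))
```

The estimates come out just under 16 digits for double and a little over 7 for float, in line with the figures quoted in the answer.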