Some floating point precision and numeric limits question

Question

I know that there are tons of questions like this one, but I couldn't find my answers. Please read before voting to close (:

The numeric coprocessor has eight floating point registers. 
Each register holds 80 bits of data. 
Floating point numbers are always stored as 80-bit 
extended precision numbers in these registers.

How is that possible when sizeof shows something different? For example, on the x64 architecture, sizeof(double) is 8 bytes, which is far from 80 bits.

  • Why does std::numeric_limits<long double>::max() give me 1.18973e+4932?! This is a huuuuuuuuuuge number. If this is not the way to get the maximum of a floating point type, then why does it compile at all, and, even more, why does it return a value?

  • What does this mean:

Double precision magnitudes can range from approximately 10^−308 to 10^308 

These are huge numbers. How can they be stored in 8 bytes, or even in 16 bytes (extended precision, which is only 128 bits)?

Obviously, I'm missing something. Actually, obviously, a lot of things.

Solution

1) sizeof is the size in memory, not in a register. sizeof is in bytes, so 8 bytes = 64 bits. When doubles are calculated in the coprocessor's registers (on this architecture), they get an extra 16 bits for more precise intermediate calculations. When the value is copied back to memory, the extra 16 bits are lost.

2) Why do you think long double doesn't go up to 1.18973e+4932?

3) Why can't you store 10^308 in 8 bytes? I only need 13 bits: 4 to store the 10, and 9 to store the 308.
