What range of numbers can be represented in 16-, 32- and 64-bit IEEE-754 systems?


Question


I know a little bit about how floating-point numbers are represented, but not enough, I'm afraid.

The general question is:

For a given precision (for my purposes, the number of accurate decimal places in base 10), what range of numbers can be represented for 16-, 32- and 64-bit IEEE-754 systems?

Specifically, I'm only interested in the range of 16-bit and 32-bit numbers accurate to +/-0.5 (the ones place) or +/- 0.0005 (the thousandths place).

Answer

I'm drawing this answer from the MATLAB documentation for the function EPS, but it should apply universally to IEEE-754 floating point numbers.

For a given floating point number X, if

2^E <= abs(X) < 2^(E+1)

then, for normalized (non-subnormal) X, the distance from X to the next larger representable floating point number (epsilon) is:

epsilon = 2^(E-52)    % For a 64-bit float (double precision)
epsilon = 2^(E-23)    % For a 32-bit float (single precision)
epsilon = 2^(E-10)    % For a 16-bit float (half precision)
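As a rough check of these formulas, here is a minimal Python sketch. The names `FRACTION_BITS`, `STRUCT_CODES`, `spacing` and `next_up` are hypothetical helpers introduced for illustration, not part of MATLAB or any library. It assumes X is a normal (non-subnormal), finite, positive value that is exactly representable in the chosen format, and it compares the formula against the gap measured by incrementing the value's bit pattern.

```python
import math
import struct

# Explicit fraction bits per IEEE-754 binary format (the 10, 23 and 52 above).
FRACTION_BITS = {"half": 10, "single": 23, "double": 52}

# struct codes as (float, same-width unsigned int); hypothetical helper table.
STRUCT_CODES = {"half": ("<e", "<H"), "single": ("<f", "<I"), "double": ("<d", "<Q")}


def spacing(x, fmt="double"):
    """Gap predicted by epsilon = 2^(E - p), where 2^E <= abs(x) < 2^(E+1)."""
    if x == 0 or math.isinf(x) or math.isnan(x):
        raise ValueError("the formula only covers finite, nonzero values")
    # math.frexp returns (m, e) with abs(x) == m * 2**e and 0.5 <= m < 1,
    # so the E used in the text is e - 1.
    _, e = math.frexp(abs(x))
    return math.ldexp(1.0, (e - 1) - FRACTION_BITS[fmt])


def next_up(x, fmt):
    """Next larger value in `fmt` after a positive finite x (bit pattern + 1)."""
    float_code, int_code = STRUCT_CODES[fmt]
    (bits,) = struct.unpack(int_code, struct.pack(float_code, x))
    (nxt,) = struct.unpack(float_code, struct.pack(int_code, bits + 1))
    return nxt


if __name__ == "__main__":
    for fmt, x in [("half", 1024.0), ("single", 1000.0), ("double", 1000.0)]:
        predicted = spacing(x, fmt)
        measured = next_up(x, fmt) - x
        print(f"{fmt:6s} x={x}: predicted={predicted!r} measured={measured!r}")
```

The bit-increment trick works because, for positive finite values, adjacent IEEE-754 numbers have adjacent integer encodings. Subnormal values are deliberately left out, since their spacing is constant and the formula does not apply to them.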

The above equations allow us to compute the following:

  • For half precision...

    If you want an accuracy of +/-0.5 (or 2^-1), the number must be smaller than 2^10. From 2^10 upward, the distance between adjacent floating point numbers is at least 1, which is greater than 0.5.

    If you want an accuracy of +/-0.0005 (about 2^-11), the number must be smaller than 1. From 1 upward, the distance between adjacent floating point numbers is at least 2^-10 (about 0.00098), which is greater than 0.0005.

  • For single precision...

    If you want an accuracy of +/-0.5 (or 2^-1), the number must be smaller than 2^23. From 2^23 upward, the distance between adjacent floating point numbers is at least 1, which is greater than 0.5.

    If you want an accuracy of +/-0.0005 (about 2^-11), the number must be smaller than 2^13. From 2^13 upward, the distance between adjacent floating point numbers is at least 2^-10 (about 0.00098), which is greater than 0.0005.

  • For double precision...

    If you want an accuracy of +/-0.5 (or 2^-1), the number must be smaller than 2^52. From 2^52 upward, the distance between adjacent floating point numbers is at least 1, which is greater than 0.5.

    If you want an accuracy of +/-0.0005 (about 2^-11), the number must be smaller than 2^42. From 2^42 upward, the distance between adjacent floating point numbers is at least 2^-10 (about 0.00098), which is greater than 0.0005.
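The six bounds above can be reproduced with a short Python sketch. `max_magnitude` and `FRACTION_BITS` are hypothetical names used only for this illustration. The sketch follows the answer in treating "accuracy" as the spacing between adjacent values, and it ignores each format's finite exponent range (half precision, for example, tops out at 65504), which does not matter for the tolerances discussed here.

```python
import math

# Explicit fraction bits per IEEE-754 binary format.
FRACTION_BITS = {"half": 10, "single": 23, "double": 52}


def max_magnitude(tolerance, fmt):
    """Power of two at which the spacing first exceeds `tolerance`.

    Values with abs(x) strictly below the returned bound have
    spacing <= tolerance (assuming normal numbers).
    """
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    # Largest k with 2**k <= tolerance; frexp gives tolerance == m * 2**e
    # with 0.5 <= m < 1, so k == e - 1.
    _, e = math.frexp(tolerance)
    k = e - 1
    # spacing 2**(E - p) <= 2**k requires E <= k + p,
    # i.e. abs(x) < 2**(k + p + 1).
    return math.ldexp(1.0, k + FRACTION_BITS[fmt] + 1)


if __name__ == "__main__":
    for fmt in ("half", "single", "double"):
        for tol in (0.5, 0.0005):
            bound = max_magnitude(tol, fmt)
            print(f"{fmt:6s} tol={tol}: abs(x) < 2^{int(math.log2(bound))}")
```

Rounding the tolerance down to a power of two is what turns 0.0005 into 2^-11, matching the approximation used in the list.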
