Bias value and range of the exponent of floating point

Question

I have realized that I did not pay proper attention to the floating-point part of the IEEE 754 standard while sitting at my university desk. Even though I am not currently struggling with embedded work, I feel incompetent, and unworthy of the title of engineer, because I lack a way of doing the calculations and a full grasp of the standard.

What I know:

  • The exponent values 0 and 255 are special values used to express 0 and infinity.
  • There is an implicit 1 that is used to express the 23 stored bits as 24,
  • where e becomes 1 only if it is 000; if it is 111 and the mantissa is 0000, then the value is infinity; and if it is 111 and the mantissa is XXXX, then it is not a number (NaN).

What I don't understand:

  • How can we mention -126 and 127, inclusively? How are the 254 possible values divided up into that inclusive range?
  • Why is 127 selected as the bias value?
  • Some sources explain the sectionization as [-126..127] but some as [-125..128]. It is really intricate and perplexing.
  • How can we say the minimum is 2^{-126} if we follow the second source mentioned above? Is it 2^{-125} then? (I have not been able to get my head around it so far, despite struggling with it.)
  • Isn't using the modulo (remainder) operator with the bias value more logical than subtraction, i.e. 2^{e%127}? (Corrected thanks to chux.)

Recommended answer

How can we mention -126 and 127, inclusively? How are the 254 possible values divided up into that inclusive range?

IEEE 754-2008 clause 3.3 says that emin, the minimum exponent, for any format shall be 1 - emax, where emax is the maximum exponent. Table 3.2 in that clause says emax for the 32-bit format (named "binary32") shall be 127. So emin is 1 - 127 = -126.

There is no mathematical constraint that forces this. The relationship is chosen as a matter of preference. I recall there being a desire to have slightly more positive exponents than negative ones but do not recall the justification for this.

Why is 127 selected as the bias value?

Once the bounds above are selected, 127 is necessarily the value needed to encode them in eight bits (as codes 1-254, leaving 0 and 255 as special codes): code 1 must map to -126 and code 254 must map to 127, and subtracting 127 is the only offset that does both.
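As a rough illustration of that argument, the sketch below derives the bias from the bounds and lists the exponents that the 254 normal codes map to. The constant names are made up for this example; the values 8 and 127 are the binary32 parameters quoted above.

```python
# Illustrative constants for binary32 (names are assumptions for this sketch).
EXP_BITS = 8
EMAX = 127
EMIN = 1 - EMAX                       # emin = 1 - emax, so -126

# Codes 0 and 2**EXP_BITS - 1 are reserved as special codes.
first_code = 1
last_code = 2**EXP_BITS - 2           # 254

# The bias must send the first normal code to EMIN ...
bias = first_code - EMIN              # 1 - (-126) = 127
# ... and the same bias must send the last normal code to EMAX.
assert last_code - bias == EMAX

exponents = [code - bias for code in range(first_code, last_code + 1)]
print(len(exponents), exponents[0], exponents[-1])   # 254 -126 127
```

So the 254 normal codes cover -126 through 127 inclusive, with one more positive exponent than negative.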

Some sources explain the sectionization as [-126..127] but some as [-125..128]. It is really intricate and perplexing.

Given the bits of a binary32, namely the sign bit S, the eight exponent bits E (which are a binary representation of a number e), and the 23 significand bits F (which are a binary representation of a number f), and given 0 < e < 255, the following are equivalent to each other:

  • The number represented is (-1)^S • 2^{e-127} • (1 + f/2^{23}).
  • The number represented is (-1)^S • 2^{e-127} • 1.F_2.
  • The number represented is (-1)^S • 2^{e-126} • (1/2 + f/2^{24}).
  • The number represented is (-1)^S • 2^{e-126} • 0.1F_2.
  • The number represented is (-1)^S • 2^{e-150} • (2^{23} + f).
  • The number represented is (-1)^S • 2^{e-150} • 1F._2.

The difference between the first two is just that the first takes the significand bits F, treats them as a binary numeral to get a number f, then divides that number by 2^{23} and adds 1, whereas the second uses the 23 significand bits F to write a 24-bit numeral "1.F", which it then interprets as a binary numeral. These two methods produce the same value.

The difference between the first pair and the second pair is that the first prepares a significand in the half-open interval [1, 2), whereas the second prepares a significand in the half-open interval [1/2, 1) and adjusts the exponent to compensate. The product is the same.

The difference between the first pair and the third pair is also one of scaling. The third pair scales the significand so that it is an integer. The first form is most commonly seen in discussions of floating-point numbers, but the third form is useful for mathematical proofs because number theory generally works with integers. This form is also mentioned in passing in IEEE 754, again in clause 3.3.
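To make the equivalence concrete, here is a minimal sketch that extracts S, e, and f from a binary32 encoding and evaluates the three scalings with exact rational arithmetic. The helper names `fields` and `three_forms` are invented for this example; it relies only on Python's standard `struct` and `fractions` modules.

```python
import struct
from fractions import Fraction

def fields(x):
    """Round x to binary32 and return its fields (S, e, f) as integers."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits >> 31, (bits >> 23) & 0xFF, bits & 0x7FFFFF

def three_forms(x):
    """Evaluate the [1, 2), [1/2, 1), and integer-significand forms exactly."""
    s, e, f = fields(x)
    assert 0 < e < 255, "this sketch handles normal numbers only"
    sign = (-1) ** s
    two = Fraction(2)
    form1 = sign * two ** (e - 127) * (1 + Fraction(f, 2**23))
    form2 = sign * two ** (e - 126) * (Fraction(1, 2) + Fraction(f, 2**24))
    form3 = sign * two ** (e - 150) * (2**23 + f)
    return form1, form2, form3

a, b, c = three_forms(0.15625)   # 0.15625 = 5/32 is exactly representable
print(a == b == c, a)            # True 5/32
```

Only the split between the power of two and the significand changes from form to form; the product is identical.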

How can we say the minimum is 2^{-126} if we follow the second source mentioned above? Is it 2^{-125} then? (I have not been able to get my head around it so far, despite struggling with it.)

The minimum positive normal value has S bit 0, E bits 00000001, and F bits 00000000000000000000000. In the first form, this represents +1 • 2^{1-127} • 1 = 2^{-126}. In the second form, it represents +1 • 2^{1-126} • 1/2 = 2^{-126}. In the third form, it represents +1 • 2^{1-150} • 2^{23} = 2^{-126}. So the form is irrelevant; the values represented are the same.
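This can be checked directly. The sketch below builds the bit pattern S = 0, E = 00000001, F = 0 and reinterprets it as a binary32 using Python's `struct` module; the variable names are illustrative only.

```python
import struct

# S = 0, E = 00000001, F = 23 zero bits.
bits = (0 << 31) | (1 << 23) | 0

# Reinterpret the 32-bit pattern as a binary32 value.
(value,) = struct.unpack(">f", struct.pack(">I", bits))

# The three scalings, evaluated for e = 1 and f = 0.
form1 = 2.0 ** (1 - 127) * 1
form2 = 2.0 ** (1 - 126) * 0.5
form3 = 2.0 ** (1 - 150) * 2**23

print(value == form1 == form2 == form3 == 2.0 ** -126)   # True
```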

Isn't using the modulo (remainder) operator with the bias value more logical than subtraction, i.e. 2^{e%127}?

No. That would cause the exponent field values 1 and 128 to map to the same value, and that would waste some encodings. There is no benefit to that.
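A quick way to see the waste is to count how many distinct exponents each rule produces over the normal codes 1-254. This is only a small sketch; the variable names are made up for the illustration.

```python
codes = range(1, 255)                 # normal exponent codes 1..254

# Group the codes by the exponent a modulo rule would assign them.
by_mod = {}
for e in codes:
    by_mod.setdefault(e % 127, []).append(e)

by_sub = {e - 127 for e in codes}     # exponents under the subtraction rule

print(by_mod[1])                      # [1, 128]
print(len(by_mod), len(by_sub))       # 127 254
```

Under the modulo rule every exponent is shared by two codes, so half of the encodings would be redundant, while subtraction gives each code its own exponent.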

Additionally, the encoding format is such that all positive floating-point numbers are in the same order as their encodings: increasing the encoding increases the value represented, and vice versa. This relationship would not hold with any sort of wrapped interpretation of the exponent field. (Unfortunately, the order is flipped for negative numbers, so comparing the encodings of floating-point numbers as pure integers does not give the same results as comparing the floating-point numbers.)
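The ordering property can be observed on a few sample values. In this sketch the helper `encoding` is a name chosen for the example, and the sample values are arbitrary; it returns the binary32 bit pattern as an unsigned integer.

```python
import struct

def encoding(x):
    """Return the binary32 bit pattern of x as an unsigned 32-bit integer."""
    (bits,) = struct.unpack(">I", struct.pack(">f", x))
    return bits

# Positive values in increasing order, including one subnormal (1e-40).
positives = [1e-40, 2.0 ** -126, 0.5, 1.0, 1.5, 2.0, 3.0e38]
pos_codes = [encoding(x) for x in positives]
print(pos_codes == sorted(pos_codes))                 # True

# Negative values in increasing order: their encodings decrease.
negatives = [-3.0, -1.0, -0.5]
neg_codes = [encoding(x) for x in negatives]
print(neg_codes == sorted(neg_codes, reverse=True))   # True
```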
