IEEE Std 754 Floating-Point: let t := a - b, does the standard guarantee that a == b + t?

Views: 142 Published: 2017/12/21 21:53:45 c++ c floating-point ieee-754

Problem description

Assume that t, a, and b are all double (IEEE Std 754) variables, and that neither a nor b is NaN (though either may be Inf). After t = a - b, is a == b + t guaranteed?

Solution

Absolutely not. One obvious case is a = DBL_MAX, b = -DBL_MAX. Then t = INFINITY, so b + t is also INFINITY, not DBL_MAX.
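A minimal C sketch of the overflow case, assuming double is IEEE binary64 and the default round-to-nearest mode is in effect. The helper names roundtrip_holds and overflow_case_is_infinite are made up for this illustration, and the volatile qualifiers are only there to make each intermediate get stored as a double rather than kept in a wider register.

```c
#include <float.h>
#include <math.h>

/* Hypothetical helper: returns 1 if a == b + (a - b) when every step
   is rounded to double, and 0 otherwise. */
int roundtrip_holds(double a, double b)
{
    volatile double t = a - b;     /* rounded difference */
    volatile double back = b + t;  /* rounded sum */
    return back == a;
}

/* Hypothetical helper: returns 1 if DBL_MAX - (-DBL_MAX) overflows
   to positive infinity, as described in the answer. */
int overflow_case_is_infinite(void)
{
    volatile double a = DBL_MAX;
    volatile double b = -DBL_MAX;
    volatile double t = a - b;
    return isinf(t) && t > 0;
}
```

Under those assumptions, roundtrip_holds(DBL_MAX, -DBL_MAX) should report a failed round trip, while a case with an exact difference such as roundtrip_holds(3.0, 1.0) should succeed.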

What may be more surprising is that this can also fail without any overflow. Basically, every such case is one where a - b is inexact. For example, if a is DBL_EPSILON/4 and b is -1, then a - b rounds to 1 (assuming the default rounding mode), and (a - b) + b is then 0 rather than DBL_EPSILON/4.
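The non-overflow case can be checked the same way. This is a sketch under the assumption that double is IEEE binary64 with round-to-nearest; sub_then_add is a hypothetical name, and volatile is used to keep the intermediates from being held at extended precision.

```c
#include <float.h>

/* Hypothetical helper: computes b + (a - b), rounding to double
   after the subtraction and again after the addition. */
double sub_then_add(double a, double b)
{
    volatile double t = a - b;  /* 1 + DBL_EPSILON/4 rounds to 1 here */
    volatile double r = b + t;  /* -1 + 1 is exactly 0 */
    return r;
}
```

With a = DBL_EPSILON / 4 and b = -1.0, the exact difference 1 + 2^-54 lies less than half a unit in the last place above 1, so it rounds down to 1 and the small value of a is lost.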

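The reason I mention this second example is that this add-then-subtract pattern is the canonical way of forcing rounding to a particular precision in IEEE arithmetic. For instance, if you have a number in the range [0,1) and want to force rounding it to a fixed number of fractional bits, you would add and then subtract a suitable power of two such as 0x1p49. Adjacent doubles in [0x1p49, 0x1p50) are 2^-3 apart, so that constant rounds the value to the nearest multiple of 1/8.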
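A small C sketch of the add-then-subtract trick, assuming IEEE binary64 doubles and round-to-nearest-even. In binary64, values in [0x1p49, 0x1p50) are spaced 2^-3 apart, so this sketch expects results that are multiples of 1/8. The function name round_via_add_sub is invented for illustration.

```c
/* Hypothetical helper: rounds x, assumed to lie in [0,1), by adding
   and then subtracting 0x1p49. The addition discards the low-order
   bits of x; the subtraction is exact. */
double round_via_add_sub(double x)
{
    volatile double shifted = x + 0x1p49;        /* rounding happens here */
    volatile double rounded = shifted - 0x1p49;  /* exact */
    return rounded;
}
```

This only works if the compiler performs both operations as written, which is why the intermediates are volatile; options that relax floating-point semantics, such as -ffast-math, may cancel the two operations.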
