What is the rationale for all comparisons returning false for IEEE754 NaN values?


Question

Why do comparisons of NaN values behave differently from all other values? That is, all comparisons with the operators ==, <=, >=, <, > where one or both values is NaN return false, contrary to the behaviour of all other values.
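To make the behaviour concrete, here is a minimal sketch in C. The helper name count_true_comparisons is made up for illustration, and IEEE 754 semantics for double are assumed.

```c
/* Hypothetical helper: counts how many of the five operators
   ==, <=, >=, <, > report true for the pair (a, b). */
int count_true_comparisons(double a, double b)
{
    return (a == b) + (a <= b) + (a >= b) + (a < b) + (a > b);
}
```

For two ordinary numbers the count is 2 or 3 (for example, 1.0 and 2.0 satisfy <= and <). As soon as either argument is NaN the count drops to 0.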

I suppose this simplifies numerical computations in some way, but I couldn't find an explicitly stated reason, not even in the Lecture Notes on the Status of IEEE 754 by Kahan, which discusses other design decisions in detail.

This deviant behavior causes trouble when doing simple data processing. For example, when sorting a list of records w.r.t. some real-valued field in a C program, I need to write extra code to handle NaN as the maximal element; otherwise the sort algorithm could become confused.
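The kind of extra code meant here might look like the following sketch. It assumes a plain array of double rather than records, and the names compare_nan_last and sort_nan_last are invented for this example.

```c
#include <math.h>
#include <stdlib.h>

/* qsort comparator: ascending order, with every NaN treated as larger
   than any number and all NaNs treated as equal to each other. */
int compare_nan_last(const void *pa, const void *pb)
{
    double a = *(const double *)pa;
    double b = *(const double *)pb;
    int a_nan = isnan(a) != 0;
    int b_nan = isnan(b) != 0;

    if (a_nan || b_nan)
        return a_nan - b_nan;      /* NaN vs number: NaN sorts last */
    return (a > b) - (a < b);      /* ordinary numeric comparison */
}

/* Sorts the array in place using the comparator above. */
void sort_nan_last(double *values, size_t count)
{
    qsort(values, count, sizeof values[0], compare_nan_last);
}
```

Without the explicit NaN branch, a comparator built only from < and > would report NaN as "equal" to every number, which is not a consistent ordering and is the kind of thing that can confuse a sort.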

The answers so far all argue that it is meaningless to compare NaNs.

I agree, but that doesn't mean that the correct answer is false; rather, it would be a Not-a-Boolean (NaB), which fortunately doesn't exist.

So the choice of returning true or false for comparisons is in my view arbitrary, and for general data processing it would be advantageous if it obeyed the usual laws (reflexivity of ==, trichotomy of <, ==, >), lest data structures which rely on these laws become confused.

So I'm asking for some concrete advantage of breaking these laws, not just philosophical reasoning.

Edit 2: I think I understand now why making NaN maximal would be a bad idea: it would mess up the computation of upper limits.

NaN != NaN might be desirable to avoid detecting convergence in a loop such as

while (x != oldX) {
    oldX = x;
    x = better_approximation(x);
}

which, however, would be better written by comparing the absolute difference with a small limit. So IMHO this is a relatively weak argument for breaking reflexivity at NaN.
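A tolerance-based version of that loop could be sketched as follows. The Newton step for a square root is only a stand-in for better_approximation, and the names newton_sqrt_step, approximate_sqrt, tolerance and max_steps are assumptions made for this example.

```c
#include <math.h>

/* Stand-in for better_approximation: one Newton step toward sqrt(target). */
double newton_sqrt_step(double x, double target)
{
    return 0.5 * (x + target / x);
}

/* Iterates until two successive approximations differ by at most
   `tolerance`, or until `max_steps` steps have been taken. */
double approximate_sqrt(double target, double tolerance, int max_steps)
{
    double x = target > 1.0 ? target : 1.0;   /* crude starting guess */
    double old_x;
    int steps = 0;

    do {
        old_x = x;
        x = newton_sqrt_step(x, target);
        steps++;
    } while (fabs(x - old_x) > tolerance && steps < max_steps);

    return x;
}
```

Note how NaN interacts with the two loop conditions. With x != oldX, a NaN keeps the loop running because the inequality is always true. With fabs(x - old_x) > tolerance, a NaN ends the loop at once because the comparison is false, so the caller still has to check the result with isnan.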

Recommended answer

I was a member of the IEEE-754 committee, so I'll try to help clarify things a bit.

First off, floating-point numbers are not real numbers, and floating-point arithmetic does not satisfy the axioms of real arithmetic. Trichotomy is not the only property of real arithmetic that does not hold for floats, nor even the most important. For example:

  • Addition is not associative.
  • The distributive law does not hold.
  • There are floating-point numbers without inverses.
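Two of these failures are easy to reproduce. The sketch below assumes IEEE 754 double arithmetic with default compiler settings (no fast-math style reassociation); both function names are invented for illustration.

```c
/* Returns 1 when (a + b) + c and a + (b + c) round to different doubles. */
int addition_order_matters(double a, double b, double c)
{
    volatile double left = (a + b) + c;    /* volatile forces rounding to double */
    volatile double right = a + (b + c);
    return left != right;
}

/* Returns 1 when x times its computed reciprocal is exactly 1.0. */
int has_exact_reciprocal(double x)
{
    volatile double reciprocal = 1.0 / x;
    volatile double product = x * reciprocal;
    return product == 1.0;
}
```

With a = 1e20, b = -1e20 and c = 1.0, the left grouping gives 1.0 while the right grouping gives 0.0, because 1.0 is absorbed when added to -1e20. Zero has no reciprocal in this sense either: 1.0 / 0.0 is infinity, and 0.0 times infinity is NaN.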

I could go on. It is not possible to specify a fixed-size arithmetic type that satisfies all of the properties of real arithmetic that we know and love. The 754 committee has to decide to bend or break some of them. This is guided by some pretty simple principles:

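  1. When we can, we match the behavior of real arithmetic.
  2. When we can't, we try to make the violations as predictable and as easy to diagnose as possible.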

Regarding your comment "that doesn't mean that the correct answer is false", this is wrong. The predicate (y < x) asks whether y is less than x. If y is NaN, then it is not less than any floating-point value x, so the answer is necessarily false.

I mentioned that trichotomy does not hold for floating-point values. However, there is a similar property that does hold. Clause 5.11, paragraph 2 of the 754-2008 standard:

Four mutually exclusive relations are possible: less than, equal, greater than, and unordered. The last case arises when at least one operand is NaN. Every NaN shall compare unordered with everything, including itself.
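In C the four-way outcome can be recovered with the isunordered macro from <math.h> (C99). The enum and the function name below are assumptions for this sketch, not something defined by the standard.

```c
#include <math.h>

/* The four mutually exclusive relations from clause 5.11. */
enum relation { REL_LESS, REL_EQUAL, REL_GREATER, REL_UNORDERED };

/* Classifies the pair (x, y) into exactly one of the four relations. */
enum relation compare_relation(double x, double y)
{
    if (isunordered(x, y))      /* true when x or y (or both) is NaN */
        return REL_UNORDERED;
    if (x < y)
        return REL_LESS;
    if (x > y)
        return REL_GREATER;
    return REL_EQUAL;           /* includes +0.0 compared with -0.0 */
}
```

Every pair of doubles lands in exactly one case, which is the "four-way" replacement for trichotomy that the clause describes.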

As far as writing extra code to handle NaNs goes, it is usually possible (though not always easy) to structure your code in such a way that NaNs fall through properly, but this is not always the case. When it isn't, some extra code may be necessary, but that's a small price to pay for the convenience that algebraic closure brought to floating-point arithmetic.
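One reading of "fall through properly" is to phrase a check so that a false comparison lands on the safe branch. The range check below is a hypothetical illustration of that idea, not code from the answer.

```c
/* Accepts a value only when both comparisons succeed.
   A NaN fails both, so it is rejected without any special case. */
int in_range(double value, double low, double high)
{
    return value >= low && value <= high;
}

/* Looks equivalent for numbers, but it rejects by testing for
   "out of range", so a NaN (for which both tests are false) slips through. */
int in_range_fragile(double value, double low, double high)
{
    return !(value < low || value > high);
}
```

For ordinary numbers the two functions agree. For NaN, in_range returns 0 and in_range_fragile returns 1, so the choice of which side of the comparison is the "accept" path decides whether extra NaN handling is needed.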

Addendum: Many commenters have argued that it would be more useful to preserve reflexivity of equality and trichotomy on the grounds that adopting NaN != NaN doesn't seem to preserve any familiar axiom. I confess to having some sympathy for this viewpoint, so I thought I would revisit this answer and provide a bit more context.

My understanding from talking to Kahan is that NaN != NaN originated out of two pragmatic considerations:

  • That x == y should be equivalent to x - y == 0 whenever possible. Beyond being a theorem of real arithmetic, this makes hardware implementation of comparison more space-efficient, which was of utmost importance at the time the standard was developed. Note, however, that this is violated for x = y = infinity, so it's not a great reason on its own; it could have reasonably been bent to (x - y == 0) or (x and y are both NaN).
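The equivalence and its exception at infinity can be checked directly. This is a small sketch assuming IEEE 754 double arithmetic; the two function names are made up for the example.

```c
/* Equality as the == operator defines it. */
int equal_by_comparison(double x, double y)
{
    return x == y;
}

/* Equality defined through subtraction, as in "x - y == 0". */
int equal_by_subtraction(double x, double y)
{
    return x - y == 0.0;
}
```

For equal finite values both return 1, and for NaN both return 0, since NaN - NaN is NaN and NaN == 0 is false. For x = y = infinity they disagree: infinity == infinity is true, but infinity - infinity is NaN, so the subtraction form returns 0.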

More importantly, there was no isnan( ) predicate at the time that NaN was formalized in the 8087 arithmetic; it was necessary to provide programmers with a convenient and efficient means of detecting NaN values that didn't depend on programming languages providing something like isnan( ), which could take many years. I'll quote Kahan's own writing on the subject:

Were there no way to get rid of NaNs, they would be as useless as Indefinites on CRAYs; as soon as one were encountered, computation would be best stopped rather than continued for an indefinite time to an Indefinite conclusion. That is why some operations upon NaNs must deliver non-NaN results. Which operations? … The exceptions are C predicates " x == x " and " x != x ", which are respectively 1 and 0 for every infinite or finite number x but reverse if x is Not a Number ( NaN ); these provide the only simple unexceptional distinction between NaNs and numbers in languages that lack a word for NaN and a predicate IsNaN(x).
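The self-comparison test Kahan describes fits in one line of C. The function name is an assumption for this sketch, and the code relies on the compiler honouring IEEE 754 comparisons (options such as fast-math modes may optimize x != x away).

```c
/* NaN test that needs no library support:
   x != x is true only when x is a NaN. */
int is_nan_by_self_comparison(double x)
{
    return x != x;
}
```

Infinities and all finite values, including both zeros, compare equal to themselves and therefore return 0.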

Note that this is also the logic that rules out returning something like a "Not-A-Boolean". Maybe this pragmatism was misplaced, and the standard should have required isnan( ), but that would have made NaN nearly impossible to use efficiently and conveniently for several years while the world waited for programming language adoption. I'm not convinced that would have been a reasonable tradeoff.

To be blunt: the result of NaN == NaN isn't going to change now. Better to learn to live with it than to complain on the internet. If you want to argue that an order relation suitable for containers should also exist, I would recommend advocating that your favorite programming language implement the totalOrder predicate standardized in IEEE-754 (2008). The fact that it hasn't already speaks to the validity of Kahan's concern that motivated the current state of affairs.
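Where a language offers no totalOrder, a container-friendly ordering in the same spirit can be sketched by comparing bit patterns. This is a common trick rather than the standard's own definition; it assumes double is the 64-bit IEEE 754 binary64 format, and the function names are invented for the example.

```c
#include <stdint.h>
#include <string.h>

/* Maps a double to an integer key whose signed order runs
   -NaN < -infinity < ... < -0.0 < +0.0 < ... < +infinity < +NaN. */
int64_t total_order_key(double x)
{
    int64_t bits;
    memcpy(&bits, &x, sizeof bits);   /* reinterpret the bit pattern safely */
    if (bits < 0)
        bits ^= INT64_MAX;            /* reverse the order of negative values */
    return bits;
}

/* Three-way comparison that is reflexive and total, even for NaN. */
int total_order_compare(double a, double b)
{
    int64_t key_a = total_order_key(a);
    int64_t key_b = total_order_key(b);
    return (key_a > key_b) - (key_a < key_b);
}
```

Unlike ==, this comparison distinguishes -0.0 from +0.0 and treats a NaN as equal to itself, which is what sorting and ordered containers need.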
