What is the rationale for all comparisons returning false for IEEE754 NaN values?


Problem description


Why do comparisons of NaN values behave differently from all other values? That is, all comparisons with the operators ==, <=, >=, <, > where one or both values are NaN return false, contrary to the behaviour of all other values.

I suppose this simplifies numerical computations in some way, but I couldn't find an explicitly stated reason, not even in the Lecture Notes on the Status of IEEE 754 by Kahan which discusses other design decisions in detail.

This deviant behavior is causing trouble when doing simple data processing. For example, when sorting a list of records w.r.t. some real-valued field in a C program I need to write extra code to handle NaN as the maximal element, otherwise the sort algorithm could become confused.
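
To make the "extra code" concrete, here is a minimal sketch of a `qsort` comparator that treats NaN as the maximal element. The `struct record` type and the function names are hypothetical, invented for illustration; the only assumption is that every NaN should sort after every number.

```c
#include <math.h>
#include <stdlib.h>

/* Hypothetical record type: only the `value` field matters for ordering. */
struct record {
    int id;
    double value;
};

/* qsort comparator: ascending by value, with every NaN treated as larger
 * than any number and as equal to any other NaN, so the order is consistent. */
static int compare_records_nan_last(const void *pa, const void *pb)
{
    const struct record *a = pa;
    const struct record *b = pb;
    int a_nan = isnan(a->value) != 0;
    int b_nan = isnan(b->value) != 0;

    if (a_nan || b_nan)
        return a_nan - b_nan;   /* NaN vs number: 1 or -1; NaN vs NaN: 0 */
    return (a->value > b->value) - (a->value < b->value);
}

/* Sort n records in place, ascending by value, NaNs at the end. */
static void sort_records(struct record *records, size_t n)
{
    qsort(records, n, sizeof records[0], compare_records_nan_last);
}
```

A plain `a->value < b->value` comparator would report NaN as "equal" to every number, which is not a consistent ordering and is what can confuse the sort.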

Edit: The answers so far all argue that it is meaningless to compare NaNs.

I agree, but that doesn't mean that the correct answer is false, rather it would be a Not-a-Boolean (NaB), which fortunately doesn't exist.

So the choice of returning true or false for comparisons is in my view arbitrary, and for general data processing it would be advantageous if it obeyed the usual laws (reflexivity of ==, trichotomy of <, ==, >), lest data structures which rely on these laws become confused.

So I'm asking for some concrete advantage of breaking these laws, not just philosophical reasoning.

Edit 2: I think I understand now why making NaN maximal would be a bad idea, it would mess up the computation of upper limits.

NaN != NaN might be desirable to avoid detecting convergence in a loop such as

while (x != oldX) {
    oldX = x;
    x = better_approximation(x);
}

which, however, would be better written by comparing the absolute difference with a small limit. So IMHO this is a relatively weak argument for breaking reflexivity at NaN.
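
As a sketch of the tolerance-based form, the loop below uses Newton's iteration for a square root as a stand-in for `better_approximation`. The function name, the starting guess, and the iteration cap are assumptions for illustration, not part of the question.

```c
#include <math.h>

/* Newton iteration for sqrt(a), stopping when successive iterates differ
 * by at most tol or after max_iter steps. Illustrative only. */
static double newton_sqrt(double a, double tol, int max_iter)
{
    double x = a > 1.0 ? a : 1.0;   /* hypothetical starting guess */
    double old_x;
    int i = 0;

    do {
        old_x = x;
        x = 0.5 * (old_x + a / old_x);
        i++;
        /* If x is NaN, fabs(x - old_x) is NaN and the comparison below is
         * false, so the loop stops instead of spinning forever. */
    } while (fabs(x - old_x) > tol && i < max_iter);

    return x;
}
```

With this form, a NaN ends the loop because `NaN > tol` is false, and the caller can check the result with `isnan`. The `x != oldX` form would instead keep iterating once `x` becomes NaN, since `NaN != NaN` is true.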

Solution

I was a member of the IEEE-754 committee, so I'll try to help clarify things a bit.

First off, floating-point numbers are not real numbers, and floating-point arithmetic does not satisfy the axioms of real arithmetic. Trichotomy is not the only property of real arithmetic that does not hold for floats, nor even the most important. For example:

  • Addition is not associative.
  • The distributive law does not hold.
  • There are floating-point numbers without inverses.
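
Two of these properties are easy to check directly. The sketch below assumes IEEE 754 binary64 `double` arithmetic, and the helper names are made up for illustration. It covers associativity and additive inverses only, not the distributive law.

```c
#include <math.h>

/* Returns 1 if (a + b) + c and a + (b + c) give different doubles. */
static int addition_is_not_associative(double a, double b, double c)
{
    /* volatile forces each sum to be rounded to double before comparing */
    volatile double left = (a + b) + c;
    volatile double right = a + (b + c);
    return left != right;
}

/* Returns 1 if x + (-x) is not zero, i.e. -x does not act as an
 * additive inverse of x. */
static int lacks_additive_inverse(double x)
{
    return x + (-x) != 0.0;
}
```

For example, with `a = 1e30`, `b = -1e30`, `c = 1.0` the left grouping yields 1.0 while the right grouping yields 0.0, because `-1e30 + 1.0` rounds back to `-1e30`. Infinity has no additive inverse because `INFINITY + (-INFINITY)` is NaN.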

I could go on. It is not possible to specify a fixed-size arithmetic type that satisfies all of the properties of real arithmetic that we know and love. The 754 committee has to decide to bend or break some of them. This is guided by some pretty simple principles:

  1. When we can, we match the behavior of real arithmetic.
  2. When we can't, we try to make the violations as predictable and as easy to diagnose as possible.

Regarding your comment "that doesn't mean that the correct answer is false", this is wrong. The predicate (y < x) asks whether y is less than x. If y is NaN, then it is not less than any floating-point value x, so the answer is necessarily false.
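
A small sketch of what this means in C, assuming an implementation that follows IEEE 754 comparison semantics. The helper name is hypothetical.

```c
#include <math.h>

/* Counts how many of the five comparisons listed in the question
 * (==, <=, >=, <, >) evaluate to true for the pair (x, y). */
static int count_true_comparisons(double x, double y)
{
    return (x == y) + (x <= y) + (x >= y) + (x < y) + (x > y);
}
```

For two ordinary numbers the count is 2 or 3; as soon as either operand is NaN it is 0. The one operator that is true for NaN operands is `!=`.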

I mentioned that trichotomy does not hold for floating-point values. However, there is a similar property that does hold. Clause 5.11, paragraph 2 of the 754-2008 standard:

"Four mutually exclusive relations are possible: less than, equal, greater than, and unordered. The last case arises when at least one operand is NaN. Every NaN shall compare unordered with everything, including itself."
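
The four-way relation can be sketched in C with the `isunordered` macro from C99's `<math.h>`. The enum and function names are made up for illustration.

```c
#include <math.h>

/* The four mutually exclusive outcomes of comparing two floating-point values. */
enum relation { REL_LESS, REL_EQUAL, REL_GREATER, REL_UNORDERED };

/* Exactly one of the four relations holds for any pair of doubles. */
static enum relation classify(double x, double y)
{
    if (isunordered(x, y))      /* true when at least one operand is NaN */
        return REL_UNORDERED;
    if (x < y)
        return REL_LESS;
    if (x > y)
        return REL_GREATER;
    return REL_EQUAL;
}
```

Code that relies on trichotomy effectively assumes the fourth outcome never occurs; handling it explicitly is one way to keep such code well defined.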

As far as writing extra code to handle NaNs goes, it is usually possible (though not always easy) to structure your code in such a way that NaNs fall through properly, but this is not always the case. When it isn't, some extra code may be necessary, but that's a small price to pay for the convenience that algebraic closure brought to floating-point arithmetic.


Addendum: Many commenters have argued that it would be more useful to preserve reflexivity of equality and trichotomy on the grounds that adopting NaN != NaN doesn’t seem to preserve any familiar axiom. I confess to having some sympathy for this viewpoint, so I thought I would revisit this answer and provide a bit more context.

My understanding from talking to Kahan is that NaN != NaN originated out of two pragmatic considerations:

  • that x == y should be equivalent to x - y == 0 whenever possible (beyond being a theorem of real arithmetic, this makes hardware implementation of comparison more space-efficient, which was of utmost importance at the time the standard was developed — note, however, that this is violated for x = y = infinity, so it’s not a great reason on its own; it could have reasonably been bent to x - y == 0 or NaN).

  • more importantly, there was no isnan() predicate at the time that NaN was formalized in the 8087 arithmetic; it was necessary to provide programmers with a convenient and efficient means of detecting NaN values that didn’t depend on programming languages providing something like isnan(), which could take many years. I’ll quote Kahan’s own writing on the subject:

Were there no way to get rid of NaNs, they would be as useless as Indefinites on CRAYs; as soon as one were encountered, computation would be best stopped rather than continued for an indefinite time to an Indefinite conclusion. That is why some operations upon NaNs must deliver non-NaN results. Which operations? … The exceptions are C predicates " x == x " and " x != x ", which are respectively 1 and 0 for every infinite or finite number x but reverse if x is Not a Number ( NaN ); these provide the only simple unexceptional distinction between NaNs and numbers in languages that lack a word for NaN and a predicate IsNaN(x).

Note that this is also the logic that rules out returning something like a "Not-A-Boolean". Maybe this pragmatism was misplaced, and the standard should have required isnan(), but that would have made NaN nearly impossible to use efficiently and conveniently for several years while the world waited for programming language adoption. I’m not convinced that would have been a reasonable tradeoff.

To be blunt: the result of NaN == NaN isn’t going to change now. Better to learn to live with it than to complain on the internet. If you want to argue that an order relation suitable for containers should also exist, I would recommend advocating that your favorite programming language implement the totalOrder predicate standardized in IEEE-754 (2008). The fact that it hasn’t already speaks to the validity of Kahan’s concern that motivated the current state of affairs.
