In which order should floats be added to get the most precise result?

Views: 172 | Published: 2016/10/13 10:03:57 | Tags: c++, floating-point, precision, floating-point-precision

Question

This was a question I was asked at a recent interview, and I want to know the answer (I don't actually remember the theory of numerical analysis, so please help me :)

If we have some function, which accumulates floating-point numbers:

std::accumulate(v.begin(), v.end(), 0.0);

v is a std::vector<float>, for example.

- Would it be better to sort these numbers before accumulating them?
- Which order would give the most precise answer?

I suspect that sorting the numbers in ascending order would actually make the numerical error less, but unfortunately I can't prove it myself.

P.S. I do realize this probably has nothing to do with real world programming, just being curious.

Answer

Your instinct is basically right: sorting in ascending order (of magnitude) usually improves things somewhat. Consider the case where we're adding single-precision (32 bit) floats, and there are 1 billion values equal to 1 / (1 billion), and one value equal to 1. If the 1 comes first, then the sum will come to 1, since 1 + (1 / 1 billion) is 1 due to loss of precision. Each addition has no effect at all on the total.

If the small values come first, they will at least sum to something. Even then, though, there are about 2^30 of them, and after 2^25 or so additions I'm back in the situation where each one individually no longer affects the total. So I'm still going to need more tricks.
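To make the effect concrete, here is a minimal C++ sketch of a scaled-down version of that scenario. The helper names (sum_in_order, one_big_many_tiny, sum_ascending, sum_descending) and the input sizes are my own assumptions for illustration, not anything from the question. Note that it deliberately uses a float accumulator: with an initial value of 0.0, std::accumulate would accumulate in double, which hides most of the problem.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Adds the values left to right using a float accumulator.
float sum_in_order(const std::vector<float>& values) {
    float total = 0.0f;
    for (float x : values) total += x;
    return total;
}

// Hypothetical test input: many copies of one tiny value plus a single 1.0f.
std::vector<float> one_big_many_tiny(std::size_t tiny_count, float tiny) {
    assert(tiny > 0.0f && tiny < 1.0f);
    std::vector<float> values(tiny_count, tiny);
    values.push_back(1.0f);
    return values;
}

// Sums after sorting by ascending magnitude (smallest |x| first).
float sum_ascending(std::vector<float> values) {
    std::sort(values.begin(), values.end(),
              [](float a, float b) { return std::fabs(a) < std::fabs(b); });
    return sum_in_order(values);
}

// Sums after sorting by descending magnitude (largest |x| first).
float sum_descending(std::vector<float> values) {
    std::sort(values.begin(), values.end(),
              [](float a, float b) { return std::fabs(a) > std::fabs(b); });
    return sum_in_order(values);
}
```

With one_big_many_tiny(1000000, 1e-8f) the true sum is about 1.01. On a platform that evaluates float arithmetic in single precision, sum_descending should return exactly 1.0f, because every tiny value is absorbed, while sum_ascending should land close to 1.01 (not exactly, since the small additions still round).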

That's an extreme case, but in general adding two values of similar magnitude is more accurate than adding two values of very different magnitudes, since you "discard" fewer bits of precision in the smaller value that way. By sorting the numbers, you group values of similar magnitude together, and by adding them in ascending order you give the small values a "chance" of cumulatively reaching the magnitude of the bigger numbers.

Still, if negative numbers are involved it's easy to "outwit" this approach. Consider three values to sum, {1, -1, 1 billionth}. The arithmetically correct sum is 1 billionth, but if my first addition involves the tiny value then my final sum will be 0. Of the 6 possible orders, only 2 are "correct" - {1, -1, 1 billionth} and {-1, 1, 1 billionth}. All 6 orders give results that are accurate at the scale of the largest-magnitude value in the input (0.0000001% out), but for 4 of them the result is inaccurate at the scale of the true solution (100% out). The particular problem you're solving will tell you whether the former is good enough or not.
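The count of "correct" orders can be checked by brute force. The following is a small sketch, assuming single-precision evaluation; count_exact_orders is a hypothetical helper name, and "correct" is taken to mean that the float sum equals the tiny value exactly.

```cpp
#include <algorithm>
#include <array>
#include <cassert>

// Returns how many of the 6 orderings of {1, -1, tiny} produce a float sum
// that is exactly equal to `tiny` when added left to right.
int count_exact_orders(float tiny) {
    assert(tiny > 0.0f);
    std::array<float, 3> values = {{1.0f, -1.0f, tiny}};
    // std::next_permutation visits every ordering only if we start sorted.
    std::sort(values.begin(), values.end());
    int exact = 0;
    do {
        float total = 0.0f;
        for (float x : values) total += x;
        if (total == tiny) ++exact;
    } while (std::next_permutation(values.begin(), values.end()));
    return exact;
}
```

Calling count_exact_orders(1e-9f) should give 2, matching the two orders listed above. With a value that is not absorbed by 1, such as 0.25f, all 6 orders come out exact.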

In fact, you can play a lot more tricks than just adding them in sorted order. If you have lots of very small values, a middle number of middling values, and a small number of large values, then it might be most accurate to first add up all the small ones, then separately total the middling ones, add those two totals together then add the large ones. It's not at all trivial to find the most accurate combination of floating-point additions, but to cope with really bad cases you can keep a whole array of running totals at different magnitudes, add each new value to the total that best matches its magnitude, and when a running total starts to get too big for its magnitude, add it into the next total up and start a new one. Taken to its logical extreme, this process is equivalent to performing the sum in an arbitrary-precision type (so you'd do that). But given the simplistic choice of adding in ascending or descending order of magnitude, ascending is the better bet.
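One simplified reading of the "array of running totals" idea can be sketched as follows. This is my own illustration under stated assumptions, not a definitive implementation: bucketed_sum is a hypothetical name, the buckets are keyed by the binary exponent reported by std::frexp, each bucket holds at most one partial total, and inputs are assumed to be finite. When a new value lands in an occupied bucket, the two are added (they have similar magnitude) and the result is carried to whichever bucket matches its new exponent.

```cpp
#include <cassert>
#include <cmath>
#include <map>
#include <vector>

// Sums floats while keeping one running total per binary exponent.
// Assumption: all inputs are finite (no NaN or infinity).
float bucketed_sum(const std::vector<float>& values) {
    std::map<int, float> buckets;  // key: exponent from std::frexp
    for (float v : values) {
        assert(std::isfinite(v));
        float carry = v;
        while (carry != 0.0f) {
            int exponent = 0;
            (void)std::frexp(carry, &exponent);
            std::map<int, float>::iterator it = buckets.find(exponent);
            if (it == buckets.end()) {
                buckets[exponent] = carry;  // free slot: park the value here
                break;
            }
            // Occupied slot: combine two values of similar magnitude, then
            // re-file the result under its own (usually larger) exponent.
            carry += it->second;
            buckets.erase(it);
        }
    }
    // Combine the partial totals from the smallest magnitude upward.
    float total = 0.0f;
    for (const auto& entry : buckets) total += entry.second;
    return total;
}
```

For equal inputs this behaves like a binary counter, so a long run of identical tiny values is combined pairwise instead of being fed one at a time into a large total. It costs a map lookup per addition and gives no general accuracy guarantee, so treat it as a demonstration of the idea rather than a replacement for a wider accumulator type.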

It does have some relation to real-world programming, since there are some cases where your calculation can go very badly wrong if you accidentally chop off a "heavy" tail consisting of a large number of values each of which is too small to individually affect the sum, or if you throw away too much precision from a lot of small values that individually only affect the last few bits of the sum. In cases where the tail is negligible anyway you probably don't care, for example if you're only adding together a small number of values in the first place and you're only using a few significant figures of the sum.
