In which order should floats be added to get the most precise result?

Views: 15 | Posted: 2022/1/9 10:13:11 | Tags: c++, floating-point, precision

Question

This was a question I was asked at a recent interview, and I want to know the answer (I don't actually remember the theory of numerical analysis, so please help me :)

If we have some function, which accumulates floating-point numbers:

std::accumulate(v.begin(), v.end(), 0.0);

v is, for example, a std::vector of floats.

Would it be better to sort these numbers before accumulating them?

Which order would give the most precise answer?

I suspect that sorting the numbers in ascending order would actually reduce the numerical error, but unfortunately I can't prove it myself.

P.S. I do realize this probably has nothing to do with real-world programming, I'm just curious.

Recommended answer

Your instinct is basically right, sorting in ascending order (of magnitude) usually improves things somewhat. Consider the case where we're adding single-precision (32 bit) floats, and there are 1 billion values equal to 1 / (1 billion), and one value equal to 1. If the 1 comes first, then the sum will come to 1, since 1 + (1 / 1 billion) is 1 due to loss of precision. Each addition has no effect at all on the total.

If the small values come first, they will at least sum to something, although even then I have 2^30 of them, whereas after 2^25 or so I'm back in the situation where each one individually isn't affecting the total any more. So I'm still going to need more tricks.
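The effect of placing the large value first or last can be reproduced at a smaller scale. The following is a minimal sketch, not part of the original answer. It assumes IEEE 754 single precision with no excess intermediate precision, and the helper names sum_in_order and make_input are made up for illustration. It is meant to be run with 2^20 copies of 2^-25 rather than a billion copies of one billionth, so that every partial sum of the small values is exact. Note the 0.0f initial value: std::accumulate accumulates in the type of its initial value, so the 0.0 used in the question would accumulate in double instead.

```cpp
#include <cstddef>
#include <numeric>
#include <vector>

// Sum in the given order, keeping the accumulator in single precision.
// (An initial value of 0.0f makes std::accumulate accumulate in float.)
float sum_in_order(const std::vector<float>& v) {
    return std::accumulate(v.begin(), v.end(), 0.0f);
}

// Hypothetical helper: `count` copies of `tiny` plus a single 1.0f,
// with the 1.0f placed either first or last.
std::vector<float> make_input(std::size_t count, float tiny, bool large_first) {
    std::vector<float> v(count, tiny);
    v.insert(large_first ? v.begin() : v.end(), 1.0f);
    return v;
}
```

Under those assumptions, with tiny = 2^-25 and count = 2^20, the large-first input sums to 1.0f because each individual addition of 2^-25 to 1 rounds back to 1, while the small-first input sums to 1.03125f, which is the exact total.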

That's an extreme case, but in general adding two values of similar magnitude is more accurate than adding two values of very different magnitudes, since you "discard" fewer bits of precision in the smaller value that way. By sorting the numbers, you group values of similar magnitude together, and by adding them in ascending order you give the small values a "chance" of cumulatively reaching the magnitude of the bigger numbers.
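Sorting by ascending magnitude before summing could look like the sketch below. The function name sum_ascending_magnitude is hypothetical, and the code assumes the input contains no NaNs (a NaN would break the ordering the comparator must provide) and that accumulation should stay in single precision.

```cpp
#include <algorithm>
#include <cmath>
#include <numeric>
#include <vector>

// Takes the vector by value so the caller's ordering is left untouched.
float sum_ascending_magnitude(std::vector<float> v) {
    // Order by absolute value, smallest magnitude first.
    std::sort(v.begin(), v.end(),
              [](float a, float b) { return std::fabs(a) < std::fabs(b); });
    // Accumulate in float (0.0f), so small values are summed before large ones.
    return std::accumulate(v.begin(), v.end(), 0.0f);
}
```

For an input consisting of 1.0f followed by 2^20 copies of 2^-25, this returns the exact total 1.03125f, whereas summing in the original order would return 1.0f.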

Still, if negative numbers are involved it's easy to "outwit" this approach. Consider three values to sum, {1, -1, 1 billionth}. The arithmetically correct sum is 1 billionth, but if my first addition involves the tiny value then my final sum will be 0. Of the 6 possible orders, only 2 are "correct" - {1, -1, 1 billionth} and {-1, 1, 1 billionth}. All 6 orders give results that are accurate at the scale of the largest-magnitude value in the input (0.0000001% out), but for 4 of them the result is inaccurate at the scale of the true solution (100% out). The particular problem you're solving will tell you whether the former is good enough or not.
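The count of "correct" orders can be checked by brute force. This is a small sketch under the same assumptions as before (IEEE 754 single precision, accumulation starting from 0.0f); the names OrderCounts and count_orders are invented for this illustration.

```cpp
#include <algorithm>
#include <array>
#include <numeric>

struct OrderCounts {
    int kept_tiny = 0;  // orders whose float sum equals the tiny value
    int lost_tiny = 0;  // orders whose float sum is exactly 0
};

// Sums {1, -1, tiny} in all 6 possible orders and classifies each result.
OrderCounts count_orders(float tiny) {
    std::array<float, 3> v = {-1.0f, tiny, 1.0f};
    std::sort(v.begin(), v.end());  // next_permutation needs a sorted start
    OrderCounts counts;
    do {
        const float sum = std::accumulate(v.begin(), v.end(), 0.0f);
        if (sum == tiny) {
            ++counts.kept_tiny;
        } else if (sum == 0.0f) {
            ++counts.lost_tiny;
        }
    } while (std::next_permutation(v.begin(), v.end()));
    return counts;
}
```

Calling count_orders(1e-9f) reports 2 orders that keep the tiny value and 4 that lose it: the tiny value survives only when 1 and -1 cancel before it is added.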

In fact, you can play a lot more tricks than just adding them in sorted order. If you have lots of very small values, a middle number of middling values, and a small number of large values, then it might be most accurate to first add up all the small ones, then separately total the middling ones, add those two totals together then add the large ones. It's not at all trivial to find the most accurate combination of floating-point additions, but to cope with really bad cases you can keep a whole array of running totals at different magnitudes, add each new value to the total that best matches its magnitude, and when a running total starts to get too big for its magnitude, add it into the next total up and start a new one. Taken to its logical extreme, this process is equivalent to performing the sum in an arbitrary-precision type (so you'd do that). But given the simplistic choice of adding in ascending or descending order of magnitude, ascending is the better bet.
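One way to sketch the "array of running totals at different magnitudes" idea is shown below. It is a simplified variant rather than the answer's exact scheme: each bin holds at most one partial total per binary exponent, and whenever a new value lands in an occupied bin the two are combined and the result is moved to whichever bin matches its new exponent. The function name binned_sum and the use of std::map are choices made for this illustration, and finite inputs are assumed.

```cpp
#include <cmath>
#include <map>
#include <vector>

float binned_sum(const std::vector<float>& values) {
    std::map<int, float> bins;  // binary exponent -> partial total
    for (float x : values) {
        float carry = x;
        // Combine with existing bins until the carry finds an empty bin
        // or cancels to zero. Each pass removes one bin, so this terminates.
        while (carry != 0.0f) {
            const int e = std::ilogb(carry);  // exponent of the carry
            auto it = bins.find(e);
            if (it == bins.end()) {
                bins.emplace(e, carry);
                break;
            }
            carry += it->second;  // both operands have similar magnitude
            bins.erase(it);
        }
    }
    // Add the partial totals from the smallest exponent to the largest
    // (std::map iterates in ascending key order).
    float total = 0.0f;
    for (const auto& bin : bins) {
        total += bin.second;
    }
    return total;
}
```

With this sketch, 1.0f followed by 2^20 copies of 2^-25 sums to 1.03125f regardless of input order, and {1e-9f, 1.0f, -1.0f} sums to 1e-9f because 1 and -1 meet in the same bin and cancel before the tiny value is added. The final pass over the bins is still an ordinary ascending sum, so it can lose precision when the bins themselves span a very wide range.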

It does have some relation to real-world programming, since there are some cases where your calculation can go very badly wrong if you accidentally chop off a "heavy" tail consisting of a large number of values each of which is too small to individually affect the sum, or if you throw away too much precision from a lot of small values that individually only affect the last few bits of the sum. In cases where the tail is negligible anyway you probably don't care, for example if you're only adding together a small number of values in the first place and you're only using a few significant figures of the sum.
