Weird numpy.sum behavior when adding zeros

Question

I understand how mathematically equivalent arithmetic operations can produce different results due to numerical errors (e.g. summing floats in different orders).

However, it surprises me that adding zeros to a sum can change the result. I thought this always holds for floats, no matter what: x + 0. == x.

Here's an example. I expected all the lines to be exactly zero. Can anybody please explain why this happens?

import numpy as np

M = 4  # number of random values
Z = 4  # number of additional zeros
for i in range(20):
    a = np.random.rand(M)
    b = np.zeros(M+Z)  # zero-padded copy of a
    b[:M] = a
    print(a.sum() - b.sum())

-4.4408920985e-16
0.0
0.0
0.0
4.4408920985e-16
0.0
-4.4408920985e-16
0.0
0.0
0.0
0.0
0.0
0.0
0.0
0.0
2.22044604925e-16
0.0
4.4408920985e-16
4.4408920985e-16
0.0

This does not seem to happen for smaller values of M and Z.

I also made sure that a.dtype == b.dtype.

Here is one more example, which also demonstrates that Python's builtin sum behaves as expected:

a = np.array([0.1,      1.0/3,      1.0/7,      1.0/13, 1.0/23])
b = np.array([0.1, 0.0, 1.0/3, 0.0, 1.0/7, 0.0, 1.0/13, 1.0/23])
print(a.sum() - b.sum())
=> -1.11022302463e-16
print(sum(a) - sum(b))
=> 0.0

I'm using NumPy 1.9.2.

Recommended answer

Short answer: You are seeing the difference between

a + b + c + d

and

(a + b) + (c + d)

which, because of floating-point inaccuracies, are not the same.
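A small pure-Python illustration of the short answer. The four values are chosen here for illustration only (they do not come from the question); they make the rounding visible because the spacing between adjacent doubles near 1e16 is 2.

```python
# Illustrative values (an assumption, not taken from the question).
a, b, c, d = 1e16, 1.0, 1.0, 1.0

# Left-to-right order: each 1.0 is rounded away when added to 1e16.
sequential = ((a + b) + c) + d

# Grouped order: the two small terms are combined first and survive.
grouped = (a + b) + (c + d)

print(sequential == grouped)  # False
print(grouped - sequential)   # 2.0
```

Both expressions are mathematically equal to 1e16 + 3, but neither floating-point result is, and they disagree with each other.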

Long answer: NumPy implements pairwise summation as an optimization of both speed (it allows for easier vectorization) and rounding error.

The NumPy sum implementation (function pairwise_sum_@TYPE@ in the NumPy source) essentially does the following:

  1. If the length of the array is less than 8, a regular for-loop summation is performed. This is why the strange result is not observed for smaller M and Z in your case (M + Z < 8): the same for-loop summation is used for both arrays.
  2. If the length is between 8 and 128, it accumulates the sums in 8 bins r[0]-r[7], then sums them as ((r[0] + r[1]) + (r[2] + r[3])) + ((r[4] + r[5]) + (r[6] + r[7])).
  3. Otherwise, it recursively sums two halves of the array.
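The three steps above can be modeled in plain Python. This is only a rough sketch of the scheme as described, not NumPy's actual C code: the name `pairwise_sum_sketch`, the `block_size` parameter, the handling of leftover elements, and the way the split point is aligned are all assumptions made for illustration.

```python
def pairwise_sum_sketch(values, block_size=128):
    """Simplified model of the three summation steps (hypothetical helper)."""
    n = len(values)
    if n < 8:
        # Step 1: plain left-to-right loop.
        total = 0.0
        for v in values:
            total += v
        return total
    if n <= block_size:
        # Step 2: eight running bins, combined in a fixed tree order.
        r = [float(v) for v in values[:8]]
        full = n - (n % 8)
        for i in range(8, full, 8):
            for j in range(8):
                r[j] += values[i + j]
        total = ((r[0] + r[1]) + (r[2] + r[3])) + ((r[4] + r[5]) + (r[6] + r[7]))
        # Leftover elements that do not fill a whole group of 8 (assumption).
        for v in values[full:]:
            total += v
        return total
    # Step 3: split into two halves and recurse.
    half = n // 2
    half -= half % 8  # keep the split aligned to the 8-wide bins (assumption)
    return (pairwise_sum_sketch(values[:half], block_size)
            + pairwise_sum_sketch(values[half:], block_size))


a = [0.1, 1.0 / 3, 1.0 / 7, 1.0 / 13]
b = a + [0.0] * 4  # same values, padded with zeros to length 8
print(pairwise_sum_sketch(a) - pairwise_sum_sketch(b))
```

With four values the sketch takes the step 1 path; once padded to eight, it takes the step 2 path, so the additions are grouped differently even though the extra terms are all zero.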

Therefore, in the first case you get a.sum() = a[0] + a[1] + a[2] + a[3], and in the second case b.sum() = (a[0] + a[1]) + (a[2] + a[3]), which can lead to a.sum() - b.sum() != 0.
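To see how often the two groupings disagree without relying on NumPy internals, the experiment from the question can be imitated with plain Python floats. This is a sketch only: the seed and the trial count are arbitrary choices, and `random.random()` stands in for `np.random.rand`.

```python
import random

random.seed(0)  # arbitrary seed, only for repeatability
trials = 1000
mismatches = 0
largest_gap = 0.0

for _ in range(trials):
    a = [random.random() for k in range(4)]
    sequential = ((a[0] + a[1]) + a[2]) + a[3]  # order used for the short array
    grouped = (a[0] + a[1]) + (a[2] + a[3])     # order used for the zero-padded array
    gap = abs(sequential - grouped)
    if gap != 0.0:
        mismatches += 1
    largest_gap = max(largest_gap, gap)

print(mismatches, "of", trials, "trials differ; largest gap:", largest_gap)
```

Any nonzero gaps are on the order of a few units in the last place of the result, consistent with the magnitudes printed in the question.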
