What causes different array sums along an axis for C- versus F-ordered arrays in NumPy?
Question
I am curious whether anyone can explain exactly what leads to the discrepancy in how NumPy handles C-ordered versus Fortran-ordered arrays in this particular case. See the code below:
system:
Ubuntu 18.10
Miniconda python 3.7.1
numpy 1.15.4
import numpy as np

def test_array_sum_function(arr):
    idx = 0
    # Sum row idx directly, then via a reduction over axis 1
    val1 = arr[idx, :].sum()
    val2 = arr.sum(axis=1)[idx]
    print('axis sums:', val1)
    print(' ', val2)
    print(' equal:', val1 == val2)
    print('total sum:', arr.sum())

n = 2_000_000
np.random.seed(42)
rnd = np.random.random(n)

print('Fortran order:')
arrF = np.zeros((2, n), order='F')
arrF[0, :] = rnd
test_array_sum_function(arrF)

print('\nC order:')
arrC = np.zeros((2, n), order='C')
arrC[0, :] = rnd
test_array_sum_function(arrC)
Prints:
Fortran order:
axis sums: 999813.1414744433
999813.1414744079
equal: False
total sum: 999813.1414744424
C order:
axis sums: 999813.1414744433
999813.1414744433
equal: True
total sum: 999813.1414744433
Recommended answer
This is almost certainly a consequence of NumPy sometimes using pairwise summation and sometimes not.
Let's build a diagnostic array:
eps = (np.nextafter(1.0, 2)-1.0) / 2
1+eps+eps+eps
# 1.0
(1+eps)+(eps+eps)
# 1.0000000000000002
X = np.full((32, 32), eps)
X[0, 0] = 1
X.sum(0)[0]
# 1.0
X.sum(1)[0]
# 1.000000000000003
X[:, 0].sum()
# 1.000000000000003
This strongly suggests that 1D arrays and contiguous axes use pairwise summation, while strided axes in a multidimensional array do not.
Note that the array has to be large enough to show this effect; otherwise NumPy falls back to ordinary summation.
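To make the difference between the two strategies concrete, here is a minimal pure-Python sketch of naive versus pairwise summation. It is not NumPy's actual implementation: the helper names `naive_sum` and `pairwise_sum` and the cutoff `block=8` are assumptions chosen purely for illustration.

```python
import math
import sys

def naive_sum(values):
    """Left-to-right accumulation into a single running total."""
    total = 0.0
    for v in values:
        total += v
    return total

def pairwise_sum(values, block=8):
    """Recursively split the input in half; sum short runs naively.

    block=8 is an illustrative cutoff, not NumPy's actual value.
    """
    n = len(values)
    if n <= block:
        return naive_sum(values)
    mid = n // 2
    return pairwise_sum(values[:mid], block) + pairwise_sum(values[mid:], block)

# Same value as (np.nextafter(1.0, 2) - 1.0) / 2 in the diagnostic above
eps = sys.float_info.epsilon / 2

values = [1.0] + [eps] * 1023
exact = math.fsum(values)          # correctly rounded reference sum

naive = naive_sum(values)          # every 1.0 + eps rounds back to 1.0
pairwise = pairwise_sum(values)    # the eps terms are combined before meeting 1.0

print(naive)                       # 1.0
print(pairwise > naive)            # True
print(abs(pairwise - exact) < abs(naive - exact))  # True

# Below the cutoff the pairwise routine degenerates to the naive loop
small = [1.0] + [eps] * 7
print(pairwise_sum(small) == naive_sum(small))     # True
```

In this sketch the naive loop loses every `eps` term, while the pairwise version keeps most of them. The last check mirrors the note about array size: when the input is no longer than the cutoff, both routines give the same result.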