numpy: efficiently summing with index arrays
Question
Suppose I have two matrices M and N (both have more than one column). I also have an index matrix I with two columns: one holding indices into M and one holding indices into N. The indices for N are unique, but the indices for M may appear more than once. The operation I would like to perform is:
for i, j in I:
    M[i] += N[j]
Is there a more efficient way to do this than a for loop?
Recommended answer
For completeness, in numpy >= 1.8 you can also use np.add's at method:
In [7]: import numpy as np
In [8]: m, n = np.random.rand(2, 10)
In [9]: m_idx, n_idx = np.random.randint(10, size=(2, 20))
In [10]: m0 = m.copy()
In [11]: np.add.at(m, m_idx, n[n_idx])
In [13]: m0 += np.bincount(m_idx, weights=n[n_idx], minlength=len(m))
In [14]: np.allclose(m, m0)
Out[14]: True
In [15]: %timeit np.add.at(m, m_idx, n[n_idx])
100000 loops, best of 3: 9.49 us per loop
In [16]: %timeit np.bincount(m_idx, weights=n[n_idx], minlength=len(m))
1000000 loops, best of 3: 1.54 us per loop
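The session above uses 1-D arrays, while the question is about matrices with several columns. The following is a minimal sketch, not the answerer's own code, of how np.add.at might be applied to the 2-D case and checked against the explicit loop. The shapes, the seed, and the contents of the index matrix I are assumptions chosen purely for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Hypothetical shapes: M has 5 rows, N has 4 rows, both with 3 columns.
M = rng.random((5, 3))
N = rng.random((4, 3))

# Index matrix I: column 0 indexes rows of M (row 0 appears twice),
# column 1 indexes rows of N (each row used once).
I = np.array([[0, 2], [3, 0], [0, 1], [4, 3]])

# Reference result from the explicit loop in the question.
expected = M.copy()
for i, j in I:
    expected[i] += N[j]

# Unbuffered in-place add: contributions to repeated row indices accumulate.
result = M.copy()
np.add.at(result, I[:, 0], N[I[:, 1]])

# For contrast, plain fancy-indexed += is buffered, so a repeated
# index receives only one of its contributions.
buffered = M.copy()
buffered[I[:, 0]] += N[I[:, 1]]

print(np.allclose(result, expected))
print(np.allclose(buffered, expected))
```

The contrast with the buffered version shows why a plain `M[I[:, 0]] += N[I[:, 1]]` is not a drop-in replacement for the loop when indices into M repeat.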
Aside from the obvious performance disadvantage, np.add.at has a couple of advantages:
- np.bincount converts its weights to double-precision floats, whereas .at operates on your array's native type. This makes it the simplest option for dealing with, e.g., complex numbers.
- np.bincount only adds weights together, whereas every ufunc has an at method, so you can repeatedly multiply, or logical_and, or whatever you feel like.
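Both points can be illustrated with a short sketch. The index array and values below are made up for the example, and the snippet only shows the at method of three ufuncs; it makes no claim about how np.bincount behaves on the same inputs.

```python
import numpy as np

# Illustrative index array: position 0 is hit three times.
idx = np.array([0, 1, 0, 2, 0])

# Complex accumulation: add.at works in the array's own dtype.
acc = np.zeros(3, dtype=complex)
vals = np.array([1 + 1j, 2 - 1j, 3 + 0j, 0 + 4j, -1 + 2j])
np.add.at(acc, idx, vals)
# acc[0] = (1+1j) + (3+0j) + (-1+2j) = 3+3j

# Repeated multiplication with multiply.at.
prod = np.ones(3)
np.multiply.at(prod, idx, np.array([2.0, 3.0, 4.0, 5.0, 0.5]))
# prod[0] = 2.0 * 4.0 * 0.5 = 4.0

# Repeated logical AND with logical_and.at.
flags = np.ones(3, dtype=bool)
np.logical_and.at(flags, idx, np.array([True, True, False, True, True]))
# flags[0] becomes False because one of its contributions is False

print(acc, acc.dtype)
print(prod)
print(flags)
```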
But for your use case, np.bincount is probably the way to go.
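Note that np.bincount accepts only 1-D weights, so matrices with several columns need an extra step. One possible approach, sketched here as an assumption rather than something taken from the answer, is to call it once per column. The helper name bincount_rows and the test data are hypothetical.

```python
import numpy as np

def bincount_rows(m_idx, values, n_rows):
    """Hypothetical helper: sum the rows of `values` into `n_rows` bins
    selected by `m_idx`, calling np.bincount once per column."""
    out = np.empty((n_rows, values.shape[1]))
    for col in range(values.shape[1]):
        out[:, col] = np.bincount(m_idx, weights=values[:, col],
                                  minlength=n_rows)
    return out

rng = np.random.default_rng(1)  # arbitrary seed
M = rng.random((6, 3))
N = rng.random((4, 3))
I = np.array([[5, 0], [2, 3], [5, 1], [2, 2]])  # indices into M repeat

# Reference result from the explicit loop in the question.
expected = M.copy()
for i, j in I:
    expected[i] += N[j]

result = M + bincount_rows(I[:, 0], N[I[:, 1]], len(M))
print(np.allclose(result, expected))
```

Whether this beats np.add.at depends on the number of columns and the sizes involved, so it is worth timing both on representative data.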