numpy: efficiently summing with index arrays


Question

Suppose I have two matrices M and N (both with more than one column). I also have an index matrix I with two columns: one holding row indices into M and one holding row indices into N. The indices for N are unique, but the indices for M may appear more than once. The operation I would like to perform is:

for i, j in I:
    M[i] += N[j]

Is there a more efficient way to do this than a for loop?

Recommended answer

For completeness, in numpy >= 1.8 you can also use the at method of np.add:

In [8]: m, n = np.random.rand(2, 10)

In [9]: m_idx, n_idx = np.random.randint(10, size=(2, 20))

In [10]: m0 = m.copy()

In [11]: np.add.at(m, m_idx, n[n_idx])

In [13]: m0 += np.bincount(m_idx, weights=n[n_idx], minlength=len(m))

In [14]: np.allclose(m, m0)
Out[14]: True

In [15]: %timeit np.add.at(m, m_idx, n[n_idx])
100000 loops, best of 3: 9.49 us per loop

In [16]: %timeit np.bincount(m_idx, weights=n[n_idx], minlength=len(m))
1000000 loops, best of 3: 1.54 us per loop
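
The session above uses 1-D arrays, while the question is about matrices with more than one column. A minimal sketch of the 2-D case follows, assuming the index matrix I is an integer array of shape (k, 2); the shapes and random data are hypothetical and chosen only for illustration. It checks np.add.at against the explicit loop and shows why a plain fancy-indexed += is not a substitute.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical shapes, chosen only for illustration.
M = rng.random((5, 3))
N = rng.random((8, 3))

# Index pairs (row of M, row of N): rows of N are unique, rows of M repeat
# (8 draws from 5 possible rows guarantee at least one repeat).
I = np.column_stack([rng.integers(0, 5, size=8), rng.permutation(8)])

# Reference result from the explicit loop in the question.
expected = M.copy()
for i, j in I:
    expected[i] += N[j]

# Unbuffered in-place add: repeated indices in I[:, 0] accumulate.
result = M.copy()
np.add.at(result, I[:, 0], N[I[:, 1]])

# Plain fancy-indexed += is buffered: for a repeated index only one
# of the additions survives, so it does not match the loop.
naive = M.copy()
naive[I[:, 0]] += N[I[:, 1]]

print(np.allclose(result, expected))
print(np.allclose(naive, expected))
```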

Aside from the obvious performance disadvantage, np.add.at has a couple of advantages:

  1. np.bincount converts its weights to double-precision floats, whereas .at operates on your array's native type. This makes it the simplest option for dealing with, e.g., complex numbers.
  2. np.bincount only adds weights together, whereas every ufunc has an at method, so you can repeatedly multiply, or logical_and, or whatever you feel like.
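
Both points can be illustrated with a short sketch. The arrays and index values below are made up for the example; only the ufunc at calls themselves come from the answer.

```python
import numpy as np

# Point 1: np.add.at keeps the array's own dtype, here complex.
z = np.zeros(3, dtype=complex)
np.add.at(z, np.array([0, 0, 2]), np.array([1 + 2j, 3 - 1j, 0.5j]))
# Index 0 receives (1+2j) + (3-1j); index 2 receives 0.5j.

# Point 2: other ufuncs have .at too. Repeated indices are multiplied.
p = np.ones(3)
np.multiply.at(p, np.array([0, 0, 1]), np.array([2.0, 3.0, 5.0]))
# p[0] is 2 * 3, p[1] is 5, p[2] is untouched.

# logical_and applied repeatedly at the same index.
flags = np.ones(3, dtype=bool)
np.logical_and.at(flags, np.array([0, 0, 1]), np.array([True, False, True]))
# flags[0] becomes False because one of its operands is False.

print(z, p, flags)
```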

But for your use case, np.bincount is probably the way to go.
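
One caveat for the matrices in the question: np.bincount accepts only 1-D weights, so it has to be applied one column at a time. The following is a rough sketch of that, not part of the original answer; the shapes, seed, and variable names are assumptions for illustration, and np.add.at serves as the reference result.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical 2-D inputs standing in for M, N and the two index columns.
M = rng.random((4, 3))
N = rng.random((6, 3))
m_idx = rng.integers(0, 4, size=6)   # rows of M, may repeat
n_idx = rng.permutation(6)           # rows of N, unique

# Reference: unbuffered accumulation with np.add.at.
expected = M.copy()
np.add.at(expected, m_idx, N[n_idx])

# bincount takes 1-D weights, so accumulate each column separately.
result = M.copy()
picked = N[n_idx]
for col in range(M.shape[1]):
    result[:, col] += np.bincount(
        m_idx, weights=picked[:, col], minlength=M.shape[0]
    )

print(np.allclose(result, expected))
```

Whether the per-column loop still beats np.add.at depends on the number of columns and rows, so it is worth timing both on your own data.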
