Numpy sum between pairs of indices in 2d array


Question


I have a 2-d numpy array (MxN) and two 1-d arrays (each of length M) that hold the starting and ending column indices, one pair per row of the 2-d array, of the range I'd like to sum over. I'm looking for the most efficient way to do this for a large array, preferably without the loop I'm currently using. An example of what I'd like to do is the following.

>>> from numpy import array, empty, random
>>> random.seed(1234)
>>> a = random.rand(4,4)
>>> print(a)
[[0.19151945 0.62210877 0.43772774 0.78535858]
 [0.77997581 0.27259261 0.27646426 0.80187218]
 [0.95813935 0.87593263 0.35781727 0.50099513]
 [0.68346294 0.71270203 0.37025075 0.56119619]]
>>> b = array([1,0,2,1])
>>> c = array([3,2,4,4])
>>> d = empty(4)
>>> for i in range(4):
...     d[i] = sum(a[i, b[i]:c[i]])
...
>>> print(d)
[1.05983651 1.05256841 0.8588124  1.64414897]

My problem is similar to the question "Numpy sum of values in subarrays between pairs of indices", but I don't think the solution presented there would be very efficient here. In that question, they want the sums of multiple subsets of the same row, so cumsum() can be used. I will only be computing one sum per row, so I don't think cumsum() would be the most efficient way to get it.

Edit: I'm sorry, I made a mistake in my code. The line inside the loop previously read d[i] = sum(a[b[i]:c[i]]). I forgot the index for the first dimension. Each set of starting and ending indices corresponds to a new row in the 2-d array.

Solution

You could do something like this:

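from numpy import array, random, zeros
random.seed(1234)
a = random.rand(4,4)
b = array([1,0,2,1])
c = array([3,2,4,4])

# lookup[k] holds the sum of all elements in the first k rows of a
lookup = zeros(len(a) + 1, a.dtype)
lookup[1:] = a.sum(1).cumsum()
# lookup[c] - lookup[b] is the total of the whole rows a[b[i]:c[i]],
# i.e. the loop body as first posted, before the edit added the row index
d = lookup[c] - lookup[b]
print(d)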

This might help if your b/c arrays are large and the windows you're summing over are large. Because your windows might overlap (for example, 2:4 and 1:4 are mostly the same), you're essentially repeating operations. By taking the cumsum as a preprocessing step you reduce the number of repeated operations and may save time. This won't help much if your windows are small and b/c are small, mostly because the cumsum also sums parts of a that you don't much care about. Hope that helps.
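
For the edited version of the question, where each (b[i], c[i]) pair selects columns within row i, the same lookup idea can be applied along axis 1 instead. What follows is only a minimal sketch under that reading, not the answer's original code: the names prefix, rows and d_loop are introduced here for illustration, and b is assumed to be an inclusive start and c an exclusive end, as in the question's slice.

```python
import numpy as np

# Same data as in the question (legacy global seeding, as in the original).
np.random.seed(1234)
a = np.random.rand(4, 4)
b = np.array([1, 0, 2, 1])   # start column for each row (inclusive)
c = np.array([3, 2, 4, 4])   # end column for each row (exclusive)

# Row-wise prefix sums with a leading zero column, so that
# prefix[i, k] == a[i, :k].sum() for every k in 0..N.
prefix = np.zeros((a.shape[0], a.shape[1] + 1), dtype=a.dtype)
prefix[:, 1:] = a.cumsum(axis=1)

# Pick one (row, column) entry per row with integer array indexing.
rows = np.arange(a.shape[0])
d = prefix[rows, c] - prefix[rows, b]

# Reference result from the explicit loop, for comparison.
d_loop = np.array([a[i, b[i]:c[i]].sum() for i in range(a.shape[0])])
print(np.allclose(d, d_loop))
```

This still computes a cumulative sum over every element of a, so the trade-off described above applies: it removes the Python-level loop, but it does work on columns outside the requested windows.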
