numpy sum antidiagonals of array


Problem description

Given a numpy ndarray, I would like to take the first two axes, and replace them with a new axis, which is the sum of their antidiagonals.

In particular, suppose I have variables x,y,z,..., and the entries of my array represent the probability

array[i,j,k,...] = P(x=i, y=j, z=k, ...)

I would like to obtain

new_array[l,k,...] = P(x+y=l, z=k, ...) = sum_i P(x=i, y=l-i, z=k, ...)

i.e., new_array[l,k,...] is the sum of all array[i,j,k,...] such that i+j=l.

What is the most efficient and/or cleanest way to do this in numpy?

EDIT to add: On the recommendation of @hpaulj, here is the obvious iterative solution:

import numpy

array = numpy.arange(30).reshape((2,3,5))
array = array / float(array.sum()) # make it a probability
# one slot per possible value of i+j, followed by the remaining axes
new_array = numpy.zeros([array.shape[0] + array.shape[1] - 1] + list(array.shape[2:]))
for i in range(array.shape[0]):
    for j in range(array.shape[1]):
        new_array[i+j,...] += array[i,j,...]
new_array.sum() # == 1

Solution

NumPy has a trace function (np.trace) that gives the sum along a diagonal. You can specify the offset and the two axes (0 and 1 are the defaults). To get an antidiagonal, you just need to flip one dimension first. np.flipud does that, though it's just [::-1,...] indexing.

Putting those together,

np.array([np.trace(np.flipud(array),offset=k) for k in range(-1,3)])

matches your new_array.
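
A minimal, self-contained sketch of that check, for readers who want to run it. The generalization of the hard-coded range(-1,3) to range(-(n0 - 1), n1) is an assumption about how the offsets extend to other shapes, and the names arr, via_trace and via_loop are illustrative only.

```python
import numpy as np

# Same example as in the question: a (2, 3, 5) array normalized to sum to 1.
arr = np.arange(30).reshape((2, 3, 5))
arr = arr / arr.sum()

n0, n1 = arr.shape[:2]

# Flip axis 0 so that antidiagonals of arr become diagonals of flipped.
flipped = arr[::-1]

# Offset k of the flipped array collects the entries with i + j == k + n0 - 1.
via_trace = np.array(
    [np.trace(flipped, offset=k) for k in range(-(n0 - 1), n1)]
)

# Reference result from the double loop in the question.
via_loop = np.zeros((n0 + n1 - 1,) + arr.shape[2:])
for i in range(n0):
    for j in range(n1):
        via_loop[i + j] += arr[i, j]

print(np.allclose(via_trace, via_loop))
```

For the (2, 3, 5) example, range(-(n0 - 1), n1) is exactly range(-1, 3), so this reduces to the one-liner above.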

It still loops over the possible values of l (4 in this case). trace itself is compiled.

In this small case, it's actually slower than your double loop (2x3 steps). Even if I move the flipud out of the inner loop, it is still slower. I don't know how this scales for larger arrays.

Part of the problem with vectorizing this even further is the fact that each diagonal has a different length.
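
One possible way around the unequal lengths, which is not part of the original answer, is to scatter-add every entry into the slot i + j with np.add.at. The sketch below is untimed and makes no claim about speed; sum_antidiagonals is a hypothetical helper name.

```python
import numpy as np

def sum_antidiagonals(arr):
    """Replace the first two axes of arr by the sums over i + j == l.

    Hypothetical helper for illustration; not taken from the original answer.
    """
    n0, n1 = arr.shape[:2]
    out = np.zeros((n0 + n1 - 1,) + arr.shape[2:], dtype=arr.dtype)
    # target[i, j] = i + j is the output slot that arr[i, j] belongs to.
    target = np.add.outer(np.arange(n0), np.arange(n1))
    # Unbuffered in-place add, so repeated slots accumulate correctly
    # (a plain out[target] += arr would keep only one contribution per slot).
    np.add.at(out, target, arr)
    return out

demo = sum_antidiagonals(np.ones((2, 3)))
print(demo)  # [1. 2. 2. 1.]
```

The antidiagonals of a 2x3 array of ones have lengths 1, 2, 2 and 1, which is what the printed sums show.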

In [331]: %%timeit
array1 = array[::-1]
np.array([np.trace(array1,offset=k) for k in range(-1,3)])
   .....: 
10000 loops, best of 3: 87.4 µs per loop

In [332]: %%timeit 
new_array = np.zeros([array.shape[0] + array.shape[1] - 1] + list(array.shape[2:]))                                                       
for i in range(2):
    for j in range(3):
        new_array[i+j] += array[i,j]
   .....: 
10000 loops, best of 3: 43.5 µs per loop

scipy.sparse has a dia format, which stores the values of nonzero diagonals. It stores a padded array of values, along with the offsets.

array([[12,  0,  0,  0],
       [ 8, 13,  0,  0],
       [ 4,  9, 14,  0],
       [ 0,  5, 10, 15],
       [ 0,  1,  6, 11],
       [ 0,  0,  2,  7],
       [ 0,  0,  0,  3]])
array([-3, -2, -1,  0,  1,  2,  3])

While that's a way of getting around the issue of variable diagonal lengths, I don't think it helps in this case where you just need their sums.
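
To make the padded layout concrete, here is a rough NumPy-only sketch that rebuilds the same values/offsets pair. It assumes the arrays shown above came from np.arange(16).reshape(4,4), which the source does not state, and dia_layout is a hypothetical helper, not a scipy function. Unlike scipy.sparse, it stores every diagonal, including any that are all zero.

```python
import numpy as np

def dia_layout(mat):
    """Padded per-diagonal layout in the style of scipy.sparse's dia format.

    Hypothetical helper: row r of the result holds the diagonal with offset
    offsets[r], placed so that column j holds mat[j - offsets[r], j].
    """
    n_rows, n_cols = mat.shape
    offsets = np.arange(-(n_rows - 1), n_cols)
    data = np.zeros((len(offsets), n_cols), dtype=mat.dtype)
    for row, off in enumerate(offsets):
        diag = np.diagonal(mat, offset=off)
        start = max(off, 0)  # column where this diagonal begins
        data[row, start:start + len(diag)] = diag
    return data, offsets

mat = np.arange(16).reshape(4, 4)  # assumed source of the arrays shown above
data, offsets = dia_layout(mat)

# Summing each padded row gives the sum of the corresponding diagonal.
diag_sums = data.sum(axis=1)
print(data)
print(offsets)
print(diag_sums)
```

The row sums equal np.trace(mat, offset=k) for each offset k, so the padding removes the ragged lengths at the cost of building the larger intermediate array.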
