numpy sum antidiagonals of array
Given a numpy ndarray, I would like to take the first two axes, and replace them with a new axis, which is the sum of their antidiagonals.
In particular, suppose I have variables x,y,z,..., and the entries of my array represent the probability
array[i,j,k,...] = P(x=i, y=j, z=k, ...)
I would like to obtain
new_array[l,k,...] = P(x+y=l, z=k, ...) = sum_i P(x=i, y=l-i, z=k, ...)
i.e., new_array[l,k,...]
is the sum of all array[i,j,k,...]
such that i+j=l
.
What is the most efficient and/or cleanest way to do this in numpy?
EDIT to add: On the recommendation of @hpaulj, here is the obvious iterative solution:
import numpy

array = numpy.arange(30).reshape((2,3,5))
array = array / float(array.sum())  # make it a probability
new_array = numpy.zeros([array.shape[0] + array.shape[1] - 1] + list(array.shape[2:]))
for i in range(array.shape[0]):
    for j in range(array.shape[1]):
        new_array[i+j,...] += array[i,j,...]
new_array.sum()  # == 1
There is a trace function that gives the sum of a diagonal. You can specify the offset and the two axes (0 and 1 are the defaults). To get the antidiagonal, you just need to flip one dimension. np.flipud does that, though it's just [::-1,...] indexing.
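As a quick sanity check (my own minimal sketch, not part of the original answer): for an array with m rows, the antidiagonal sum over i + j = l equals the trace of the vertically flipped array at offset l - (m - 1).

```python
import numpy as np

# Sketch: antidiagonal sums of a small 2x3 array via flip + trace.
a = np.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]
m, n = a.shape

# Flipping axis 0 turns each antidiagonal i + j = l into an ordinary
# diagonal of the flipped array at offset l - (m - 1).
anti = np.array([np.trace(a[::-1], offset=l - (m - 1))
                 for l in range(m + n - 1)])

# Brute-force comparison over all (i, j) pairs.
check = np.zeros(m + n - 1, dtype=a.dtype)
for i in range(m):
    for j in range(n):
        check[i + j] += a[i, j]
```

With m = 2 the offsets l - (m - 1) run over exactly range(-1, 3), matching the list comprehension used below.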
Putting those together, np.array([np.trace(np.flipud(array),offset=k) for k in range(-1,3)]) matches your new_array.
It still loops over the possible values of l (4 in this case). trace itself is compiled.
In this small case, it's actually slower than your double loop (2x3 steps). Even if I move the flipud out of the inner loop, it is still slower. I don't know how this scales for larger arrays.
Part of the problem with vectorizing this even further is the fact that each diagonal has a different length.
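One way to sidestep the variable diagonal lengths entirely (a sketch of my own, not from the answer above) is np.add.at, which accumulates correctly into the repeated target index l = i + j in a single call:

```python
import numpy as np

# Sketch: vectorized antidiagonal sums via np.add.at. Every (i, j) pair
# maps to the flat target index l = i + j; np.add.at handles the
# duplicate indices, so the unequal diagonal lengths never appear.
array = np.arange(30, dtype=float).reshape(2, 3, 5)
array /= array.sum()                                  # make it a probability

m, n = array.shape[:2]
l = (np.arange(m)[:, None] + np.arange(n)).ravel()    # l = i + j for each (i, j)

new_array = np.zeros((m + n - 1,) + array.shape[2:])
np.add.at(new_array, l, array.reshape((m * n,) + array.shape[2:]))
```

Note that plain fancy-index assignment (new_array[l] += ...) would silently drop the repeated indices; np.add.at exists precisely for this unbuffered accumulation.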
In [331]: %%timeit
array1 = array[::-1]
np.array([np.trace(array1,offset=k) for k in range(-1,3)])
.....:
10000 loops, best of 3: 87.4 µs per loop
In [332]: %%timeit
new_array = np.zeros([array.shape[0] + array.shape[1] - 1] + list(array.shape[2:]))
for i in range(2):
    for j in range(3):
        new_array[i+j] += array[i,j]
.....:
10000 loops, best of 3: 43.5 µs per loop
scipy.sparse has a dia format, which stores the values of the nonzero diagonals. It stores a padded array of values, along with the offsets.
array([[12, 0, 0, 0],
[ 8, 13, 0, 0],
[ 4, 9, 14, 0],
[ 0, 5, 10, 15],
[ 0, 1, 6, 11],
[ 0, 0, 2, 7],
[ 0, 0, 0, 3]])
array([-3, -2, -1, 0, 1, 2, 3])
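For reference, the two arrays above can be reproduced with a short sketch; I'm assuming the source matrix was np.arange(16).reshape(4, 4), which matches every value shown:

```python
import numpy as np
from scipy import sparse

# Sketch: inspect the dia-format storage of a dense 4x4 matrix
# (assumed here to be the one behind the printed arrays above).
D = np.arange(16).reshape(4, 4)
M = sparse.dia_matrix(D)

order = np.argsort(M.offsets)   # sort rows by offset for a stable view
offsets = M.offsets[order]      # the diagonal offsets, -3 through 3
data = M.data[order]            # the zero-padded per-diagonal values
```

Each row of data holds one diagonal, padded to the full column count, which is exactly the variable-length issue the dia format works around.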
While that's a way of getting around the issue of variable diagonal lengths, I don't think it helps in this case where you just need their sums.