Pure numpy expression for selecting same-length subarrays with different starting indices from a 3-D array
Question
I have a 3-D numpy array (let's call it a) with shape (74, 74, 4563), and I want to extract a length-n sub-array from each location in the first two dimensions. However, each of those sub-arrays starts in a different place, depending on the indices i and j in the first two dimensions.
For example, if n=1000, I may want a[0, 0, 0:1000], but also a[0, 1, 2:1002], and so on. I have a 2-d array (called ix0) that tells me where each sub-array starts for each i/j position. Finally, I am guaranteed that there will not be any "overflow": all the values in ix0 + n are smaller than the dimension-2 length of a, so we don't need to worry about asking for an index beyond the range that is present.
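For example...
import numpy as np
n = 1000
a = np.arange(74*74*4563).reshape(74, 74, 4563)
ix0 = np.arange(74*74).reshape(74, 74)//2 + 50  # integer start offsets
a[:, :, ix0:ix0+n]  # fails: slice bounds must be scalars, not arrays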
produces
IndexError: failed to coerce slice entry of type numpy.ndarray to integer
Is there a way to do this without looping through all the i/j index combinations or creating a big mask array?
Recommended answer
Something along these lines has been asked before, but for 2d. I may try to look that up.
But here's a quick example of what was going on in the 2d case:
In [1463]: x=np.arange(12).reshape(3,4)
In [1464]: ix0=np.array([0,2,1])
In [1465]: N=2
We could iterate over each row of x, collecting the desired length-N slice, and then join them into a list or array. A more general version of the problem lets the slice length vary from row to row, in which case the slices can't be reassembled into a regular array.
In [1466]: [x[i,ix0[i]:ix0[i]+N] for i in range(3)]
Out[1466]: [array([0, 1]), array([6, 7]), array([ 9, 10])]
and then wrap that list in np.array.
An alternative is to concatenate the indexes first:
In [1467]: x[np.arange(3)[:,None], np.array([np.r_[ix0[i]:ix0[i]+N] for i in range(3)])]
Out[1467]:
array([[ 0,  1],
       [ 6,  7],
       [ 9, 10]])
That last indexing array is:
In [1468]: np.array([np.r_[ix0[i]:ix0[i]+N] for i in range(3)])
Out[1468]:
array([[0, 1],
       [2, 3],
       [1, 2]])
To apply this to the 3d case we have two options. One is to reshape the array to 2d, apply one of these strategies, and reshape back. The other is to generalize the steps I took to create these index arrays; that shouldn't be too hard, but will take some experimenting.
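The first option (reshape to 2d, index, reshape back) could look roughly like the sketch below. It reuses small shapes like the ones in this answer, and the names X2, starts, rows and cols are made up for the illustration.

```python
import numpy as np

N = 2
X = np.arange(2 * 3 * 10).reshape(2, 3, 10)
ix0 = np.arange(2 * 3).reshape(2, 3)

# Collapse the first two axes so every (i, j) position becomes one row.
X2 = X.reshape(-1, X.shape[-1])          # shape (6, 10)
starts = ix0.ravel()                     # shape (6,)

# 2d strategy: one row index per row, N consecutive column indices per row.
rows = np.arange(X2.shape[0])[:, None]   # shape (6, 1)
cols = starts[:, None] + np.arange(N)    # shape (6, N)

# Index, then restore the two leading axes.
result = X2[rows, cols].reshape(ix0.shape + (N,))   # shape (2, 3, N)
```

Because X and ix0 are flattened in the same (row-major) order, row k of X2 lines up with entry k of starts.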
That last array shouldn't be hard to create with broadcasting.
In [1469]: ix0[:,None]+np.arange(N)
Out[1469]:
array([[0, 1],
       [2, 3],
       [1, 2]])
In [1470]: x[np.arange(3)[:,None], ix0[:,None]+np.arange(N)]
Out[1470]:
array([[ 0,  1],
       [ 6,  7],
       [ 9, 10]])
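As a side note that is not part of the original answer: NumPy 1.15 and later also provide np.take_along_axis, which handles the row-index bookkeeping itself. A minimal sketch for the same 2d example, assuming such a NumPy version is available:

```python
import numpy as np

x = np.arange(12).reshape(3, 4)
ix0 = np.array([0, 2, 1])
N = 2

# Column indices for each row, built by broadcasting: shape (3, N).
cols = ix0[:, None] + np.arange(N)

# take_along_axis pairs each row of `cols` with the same row of `x`,
# so no explicit np.arange(3)[:, None] row index is needed.
out = np.take_along_axis(x, cols, axis=1)
print(out)
```

For a 3d array the same call should work with indices of shape (I, J, N) and axis=2, since the index array only needs to have the same number of dimensions as the data.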
Now it should be even easier to generalize to 3d:
In [1487]: X=np.arange(2*3*10).reshape(2,3,10)
In [1488]: ix0=np.arange(2*3).reshape(2,3)
In [1489]: ix0[...,None]+np.arange(N)
Out[1489]:
array([[[0, 1],
        [1, 2],
        [2, 3]],

       [[3, 4],
        [4, 5],
        [5, 6]]])
In [1490]: I,J,_=np.ix_(range(2),range(3),range(N))
In [1491]: I.shape
Out[1491]: (2, 1, 1)
In [1492]: J.shape
Out[1492]: (1, 3, 1)
In [1493]: X[I, J, ix0[...,None]+np.arange(N)]
Out[1493]:
array([[[ 0,  1],
        [11, 12],
        [22, 23]],

       [[33, 34],
        [44, 45],
        [55, 56]]])
I should make sure the values are right, but the shapes match, which in this sort of thing is 80% of the work.
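One way to check the values is to compare the broadcast-indexing result against a plain loop. The sketch below wraps the 3d indexing in a helper (select_windows is a hypothetical name, not something from the question or from NumPy) and tests it on a scaled-down stand-in for the (74, 74, 4563) array, so the shapes and offsets here are assumptions chosen only to keep the example small.

```python
import numpy as np

def select_windows(a, ix0, n):
    """Return out with out[i, j, :] == a[i, j, ix0[i, j]:ix0[i, j] + n]."""
    ix0 = np.asarray(ix0)
    # Guard the no-overflow assumption stated in the question.
    if (ix0 < 0).any() or (ix0 + n > a.shape[2]).any():
        raise IndexError("a window runs outside the last axis")
    # I has shape (rows, 1, 1) and J has shape (1, cols, 1); both broadcast
    # against the (rows, cols, n) array of last-axis indices.
    I, J, _ = np.ix_(range(a.shape[0]), range(a.shape[1]), range(n))
    return a[I, J, ix0[..., None] + np.arange(n)]

# Scaled-down stand-in for the array in the question.
a = np.arange(7 * 7 * 60).reshape(7, 7, 60)
ix0 = np.arange(7 * 7).reshape(7, 7) // 2 + 5   # integer start offsets
n = 10

out = select_windows(a, ix0, n)

# Reference result from an explicit loop over every (i, j) position.
expected = np.array([[a[i, j, ix0[i, j]:ix0[i, j] + n]
                      for j in range(a.shape[1])]
                     for i in range(a.shape[0])])
assert np.array_equal(out, expected)
```

The result is a new array of shape (rows, cols, n); advanced indexing copies the data rather than returning a view, which is worth keeping in mind for the full-size array.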