Pick a TxK numpy array from a TxN numpy array using a TxK column index array
Question
This is an indirect indexing problem.
It can be solved with a list comprehension.
The question is whether, and if so how, it can be solved within numpy when data.shape is (T, N), c.shape is (T, K), and each element of c is an int between 0 and N-1 inclusive. That is, each element of c is intended to refer to a column number of data.
The goal is to obtain out, where out.shape = (T, K) and, for each i in 0..(T-1), row out[i] = [data[i, c[i, 0]], ..., data[i, c[i, K-1]]].
Concrete example:
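import numpy as np

# data has shape (T, N) = (5, 3)
data = np.array([
    [ 0,  1,  2],
    [ 3,  4,  5],
    [ 6,  7,  8],
    [ 9, 10, 11],
    [12, 13, 14]])

# c has shape (T, K) = (5, 2); each entry is a column number of data
c = np.array([
    [0, 2],
    [1, 2],
    [0, 0],
    [1, 1],
    [2, 2]])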
out should be [[0, 2], [4, 5], [6, 6], [10, 10], [14, 14]].
The first row of out is [0, 2] because the columns chosen are given by row 0 of c, which are 0 and 2, and data[0] at columns 0 and 2 is 0 and 2.
The second row of out is [4, 5] because the columns chosen are given by row 1 of c, which are 1 and 2, and data[1] at columns 1 and 2 is 4 and 5.
Numpy fancy indexing doesn't seem to solve this in an obvious way, because indexing data with c (e.g. data[c] or np.take(data, c, axis=1)) always produces a 3-dimensional array.
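To make the shape problem concrete, here is a small sketch using the example arrays from this question. The variable names indexed, taken and diagonal are only illustrative. It shows which 3-dimensional shapes the two naive attempts give, and that the wanted values sit on the "diagonal" of the np.take result.

```python
import numpy as np

data = np.arange(15).reshape(5, 3)  # same values as the example, shape (T, N) = (5, 3)
c = np.array([[0, 2], [1, 2], [0, 0], [1, 1], [2, 2]])  # shape (T, K) = (5, 2)

# Indexing with c alone treats its entries as ROW numbers of data.
indexed = data[c]
print(indexed.shape)  # (5, 2, 3), i.e. (T, K, N)

# np.take along axis 1 applies every row of c to every row of data.
taken = np.take(data, c, axis=1)
print(taken.shape)  # (5, 5, 2), i.e. (T, T, K)

# The wanted result is the part where row i of data meets row i of c.
diagonal = taken[np.arange(5), np.arange(5)]
print(diagonal.tolist())
```

So the information is in there, but at the cost of building a T times larger intermediate array.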
A list comprehension can solve it:
out = [[data[rowidx, i1], data[rowidx, i2]] for (rowidx, (i1, i2)) in enumerate(c)]
If K is 2, I suppose this is marginally OK. If K is variable, this is not so good.
The list comprehension has to be rewritten for each value of K, because it unrolls the columns picked out of data by each row of c. It also violates DRY.
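For reference, the unrolling can be avoided in plain Python with a nested comprehension that loops over each row of c instead of unpacking it. This is only a sketch (out_list is an illustrative name); it works for any K, but it is still a Python-level loop rather than a numpy solution.

```python
import numpy as np

data = np.arange(15).reshape(5, 3)  # same values as the example, shape (T, N)
c = np.array([[0, 2], [1, 2], [0, 0], [1, 1], [2, 2]])  # shape (T, K)

# The inner loop walks over however many column indices the row of c holds,
# so nothing needs rewriting when K changes.
out_list = [[data[i, j] for j in row] for i, row in enumerate(c)]

out = np.array(out_list)
print(out.tolist())
```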
Is there a solution based entirely in numpy?
Recommended answer
You can use np.choose:
In [1]: %cpaste
Pasting code; enter '--' alone on the line to stop or use Ctrl-D.
data = np.array([\
[ 0, 1, 2],\
[ 3, 4, 5],\
[ 6, 7, 8],\
[ 9, 10, 11],\
[12, 13, 14]])
c = np.array([
[0, 2],\
[1, 2],\
[0, 0],\
[1, 1],\
[2, 2]])
--
In [2]: np.choose(c, data.T[:,:,np.newaxis])
Out[2]:
array([[ 0,  2],
       [ 4,  5],
       [ 6,  6],
       [10, 10],
       [14, 14]])
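Two other numpy-only ways are worth sketching here; they are not part of the answer above. One pairs c with a broadcast column of row numbers, the other uses np.take_along_axis (available, as far as I know, since NumPy 1.15). As far as I know, np.choose also has a fixed cap on the number of choice arrays (historically 32), so one of these may be needed when N is large. The names rows, out_fancy and out_along are illustrative.

```python
import numpy as np

data = np.arange(15).reshape(5, 3)  # same values as the example, shape (T, N)
c = np.array([[0, 2], [1, 2], [0, 0], [1, 1], [2, 2]])  # shape (T, K)

# Option 1: fancy indexing with an explicit row index.
# rows has shape (T, 1) and broadcasts against c's shape (T, K),
# so element [i, k] of the result is data[i, c[i, k]].
rows = np.arange(data.shape[0])[:, np.newaxis]
out_fancy = data[rows, c]

# Option 2: take_along_axis picks, within each row, the columns listed in c.
out_along = np.take_along_axis(data, c, axis=1)

print(out_fancy.tolist())
print(out_along.tolist())
```

Both give the same (T, K) result as the np.choose call without transposing data.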