Get indices and values of an ndarray in NumPy
Question
I have an ndarray A with an arbitrary number of dimensions N. I want to create an array B of tuples (or arrays, or lists) where the first N elements of each tuple are the index and the last element is the value of A at that index.
For example:
A = array([[1, 2, 3], [4, 5, 6]])
then
B = [(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]
What is the best/fastest way to do this in NumPy without for loops?
Recommended answer
If you have Python 3, a very simple (and moderately fast) way is to use np.ndenumerate:
>>> import numpy as np
>>> A = np.array([[1, 2, 3], [4, 5, 6]])
>>> [(*idx, val) for idx, val in np.ndenumerate(A)]
[(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]
This needs a small change if it has to work on both Python 3 and Python 2, because Python 2 doesn't allow iterable unpacking inside a tuple literal. You can use tuple concatenation (addition) instead:
>>> [idx + (val,) for idx, val in np.ndenumerate(A)]
[(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]
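One detail worth knowing about the two snippets above: np.ndenumerate yields each value as a NumPy scalar, not a plain Python int. Recent NumPy versions (2.0 and later) print such scalars as np.int64(1), so the displayed list can look different from the output shown here even though the values are the same. A minimal sketch, assuming plain Python numbers are wanted in the result, converts each value with .item():

```python
import numpy as np

A = np.array([[1, 2, 3], [4, 5, 6]])

# np.ndenumerate yields (index_tuple, value) pairs; value is a NumPy scalar,
# so .item() turns it into the matching plain Python number.
B = [idx + (val.item(),) for idx, val in np.ndenumerate(A)]
print(B)
# [(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]
```

The tuple concatenation form is used so the sketch does not depend on Python 3 unpacking syntax.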
If you want to stay completely within NumPy, it is better to create the indices with np.mgrid:
>>> grid = np.mgrid[:A.shape[0], :A.shape[1]] # indices!
>>> np.stack([grid[0], grid[1], A]).reshape(3, -1).T
array([[0, 0, 1],
       [0, 1, 2],
       [0, 2, 3],
       [1, 0, 4],
       [1, 1, 5],
       [1, 2, 6]])
However, that would require a loop to convert it to a list of tuples. Converting it to a list of lists is easy, though:
>>> np.stack([grid[0], grid[1], A]).reshape(3, -1).T.tolist()
[[0, 0, 1], [0, 1, 2], [0, 2, 3], [1, 0, 4], [1, 1, 5], [1, 2, 6]]
A list of tuples is also possible without a visible for-loop:
>>> list(map(tuple, np.stack([grid[0], grid[1], A]).reshape(3, -1).T.tolist()))
[(0, 0, 1), (0, 1, 2), (0, 2, 3), (1, 0, 4), (1, 1, 5), (1, 2, 6)]
Even though there is no visible for-loop, tolist, list, tuple and map do hide a for-loop in the Python layer.
For arrays of arbitrary dimension you need to change the latter approach a bit:
coords = tuple(map(slice, A.shape))
grid = np.mgrid[coords]
# array version
np.stack(list(grid) + [A]).reshape(A.ndim+1, -1).T
# list of list version
np.stack(list(grid) + [A]).reshape(A.ndim+1, -1).T.tolist()
# list of tuple version
list(map(tuple, np.stack(list(grid) + [A]).reshape(A.ndim+1, -1).T.tolist()))
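The generalized mgrid approach can be wrapped into a small helper and checked on a 3-D array. This is only a sketch: the function name index_value_rows is made up for illustration and is not part of NumPy. It also shows a side effect of stacking indices and values into one array: they must share a dtype, so for a float array the index columns come back as floats.

```python
import numpy as np

def index_value_rows(A):
    """Return a 2-D array whose rows are (i_0, ..., i_{N-1}, A[i_0, ..., i_{N-1}]).

    Hypothetical helper name, used here only for illustration.
    """
    A = np.asarray(A)
    # One slice per axis; the grid has shape (N, *A.shape).
    grid = np.mgrid[tuple(map(slice, A.shape))]
    # Stack the N index arrays and the values, then flatten to one row per element.
    return np.stack(list(grid) + [A]).reshape(A.ndim + 1, -1).T

A3 = np.arange(8).reshape(2, 2, 2)
rows = index_value_rows(A3)
print(rows.shape)          # (8, 4)
print(rows[:3].tolist())   # [[0, 0, 0, 0], [0, 0, 1, 1], [0, 1, 0, 2]]

# With float data the index columns are upcast to float as well.
float_rows = index_value_rows(np.array([[0.5, 1.5]]))
print(float_rows.tolist())  # [[0.0, 0.0, 0.5], [0.0, 1.0, 1.5]]
```

If integer indices must be preserved alongside float values, the ndenumerate approach avoids the problem because each tuple can hold mixed types.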
The ndenumerate approach works for arrays of any dimension without change and, according to my timings, is only 2-3 times slower.
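The timings themselves are not given in the answer, so here is a rough sketch of how the two approaches could be compared with timeit. The array shape and the function names with_ndenumerate and with_mgrid are arbitrary choices for illustration, and the measured ratio will depend on the machine, the NumPy version, and the array size.

```python
import timeit
import numpy as np

A = np.random.rand(20, 20, 20)  # arbitrary test shape, chosen for illustration

def with_ndenumerate(A):
    # Pure ndenumerate version: list of (index..., value) tuples.
    return [idx + (val,) for idx, val in np.ndenumerate(A)]

def with_mgrid(A):
    # mgrid version, converted to a list of tuples at the end.
    grid = np.mgrid[tuple(map(slice, A.shape))]
    stacked = np.stack(list(grid) + [A]).reshape(A.ndim + 1, -1).T
    return list(map(tuple, stacked.tolist()))

# Sanity check: both give equal tuples. The mgrid version stores the indices
# as floats here, but 0.0 == 0 in Python, so the comparison still holds.
assert with_ndenumerate(A) == with_mgrid(A)

for fn in (with_ndenumerate, with_mgrid):
    best = min(timeit.repeat(lambda: fn(A), number=3, repeat=3)) / 3
    print(f"{fn.__name__}: {best * 1e3:.2f} ms per call")
```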