Delete element from multi-dimensional numpy array by value
Question
Given a numpy array
a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])
what's the fastest way to delete all elements of value -1 to get an array of the form
np.array([[0, 0], [1, 0, 0], [1, 0]])
Recommended answer
Another method you might consider:
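def iterative_numpy(a):
    # Keep everything that is not -1 (the value to delete)
    mask = a != -1
    # Filter each row on its own; dtype=object because rows can differ in length
    out = np.array([a[i, mask[i]] for i in range(a.shape[0])], dtype=object)
    return out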
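As a quick check of the row-wise masking idea, here is a minimal sketch applied to the array from the question. The name drop_value_per_row and the explicit np.empty allocation are assumptions made for illustration, not part of the original answer; filling a preallocated object array simply avoids relying on how np.array treats rows of unequal length.

```python
import numpy as np

def drop_value_per_row(a, value=-1):
    """Return a 1-D object array whose i-th entry is row i of `a` without `value`."""
    mask = a != value
    # Hypothetical helper: one slot per row, each slot holds its own 1-D array
    out = np.empty(a.shape[0], dtype=object)
    for i in range(a.shape[0]):
        out[i] = a[i, mask[i]]
    return out

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])
out = drop_value_per_row(a)

print([row.tolist() for row in out])  # [[0, 0], [1, 0, 0], [1, 0]]
print(out[1][2])                      # 0

# The result is one-dimensional, so two-index syntax is rejected
try:
    out[1, 2]
except IndexError as exc:
    print("IndexError:", exc)
```

The result has shape (3,), which is why chained indexing works and out[1, 2] does not.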
Divakar's method loop_compr_based calculates sums along the rows of mask and a cumulative sum of that result. This method avoids such summations but still has to iterate through the rows of a. It also returns an array of arrays, which has the annoyance that out has to be indexed with the syntax out[1][2] rather than out[1,2]. Comparing the times on random integer matrices:
In [4]: a = np.random.random_integers(-1,1, size = (3,30))
In [5]: %timeit iterative_numpy(a)
100000 loops, best of 3: 11.1 us per loop
In [6]: %timeit loop_compr_based(a)
10000 loops, best of 3: 20.2 us per loop
In [7]: a = np.random.random_integers(-1,1, size = (30,3))
In [8]: %timeit iterative_numpy(a)
10000 loops, best of 3: 59.5 us per loop
In [9]: %timeit loop_compr_based(a)
10000 loops, best of 3: 30.8 us per loop
In [10]: a = np.random.random_integers(-1,1, size = (30,30))
In [11]: %timeit iterative_numpy(a)
10000 loops, best of 3: 64.6 us per loop
In [12]: %timeit loop_compr_based(a)
10000 loops, best of 3: 36 us per loop
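The session above can be approximated outside IPython with the timeit module. This is only a sketch under stated assumptions: cumsum_split is a reconstruction of the row-sum and cumulative-sum idea from the description, not Divakar's actual loop_compr_based code, and rowwise_mask is a hypothetical stand-in for the row-by-row approach. Because np.random.random_integers has long been deprecated, the matrices are drawn with np.random.default_rng instead. Absolute numbers will differ by machine and NumPy version.

```python
import timeit
import numpy as np

def rowwise_mask(a, value=-1):
    # Filter each row separately with its own boolean mask
    mask = a != value
    return [a[i, mask[i]] for i in range(a.shape[0])]

def cumsum_split(a, value=-1):
    # Assumed reconstruction: count kept elements per row, take the running
    # total, then cut the flattened kept values at those row boundaries
    mask = a != value
    stops = mask.sum(axis=1).cumsum()
    return np.split(a[mask], stops[:-1])

rng = np.random.default_rng(0)
timings = {}
for shape in [(3, 30), (30, 3), (30, 30)]:
    # The upper bound is exclusive, so values are drawn from -1, 0, 1
    a = rng.integers(-1, 2, size=shape)
    for func in (rowwise_mask, cumsum_split):
        seconds = timeit.timeit(lambda: func(a), number=1000)
        timings[(shape, func.__name__)] = seconds / 1000
        print(shape, func.__name__, f"{seconds / 1000 * 1e6:.1f} us per call")
```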
When there are more columns than rows, iterative_numpy wins out. When there are more rows than columns, loop_compr_based wins, but transposing a first will improve the performance of both methods. When the dimensions are comparable, loop_compr_based is best.
Beyond the implementation, note that a numpy array with a non-uniform shape is not an actual array: its values do not occupy a contiguous section of memory, and the usual array operations will not work as expected.
For example:
>>> a = np.array([[1,2,3],[1,2],[1]])
>>> a*2
array([[1, 2, 3, 1, 2, 3], [1, 2, 1, 2], [1, 1]], dtype=object)
Notice that numpy actually informs us that this is not the usual numpy array with the note dtype=object.
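The same behaviour can be reproduced as a script. One assumption to flag: recent NumPy releases refuse to build an array from a ragged nested list unless dtype=object is requested, so this sketch passes it explicitly. The variable names are illustrative only.

```python
import numpy as np

# dtype=object is requested explicitly for the ragged nested list
ragged = np.array([[1, 2, 3], [1, 2], [1]], dtype=object)

print(ragged.shape)  # (3,)
print(ragged.dtype)  # object

# Each element is a plain Python list, so * 2 repeats it instead of doubling values
doubled = ragged * 2
print(doubled.tolist())  # [[1, 2, 3, 1, 2, 3], [1, 2, 1, 2], [1, 1]]
```

The array holds three references to Python lists rather than a block of integers, which is why the multiplication falls back to list repetition.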
Thus it might be best to just make a list of numpy arrays and use them accordingly.
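A minimal sketch of that suggestion, using the array from the question: each row becomes an ordinary numeric numpy array stored in a Python list, so per-row arithmetic behaves as usual. The variable names are illustrative, not from the original answer.

```python
import numpy as np

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])

# One regular integer array per row, with every -1 filtered out
rows = [row[row != -1] for row in a]

# Normal element-wise operations work on each row
doubled = [row * 2 for row in rows]
totals = [int(row.sum()) for row in rows]

print([r.tolist() for r in rows])     # [[0, 0], [1, 0, 0], [1, 0]]
print([r.tolist() for r in doubled])  # [[0, 0], [2, 0, 0], [2, 0]]
print(totals)                         # [0, 1, 1]
```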