Delete element from multi-dimensional numpy array by value


Problem description

Given a numpy array

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])

what's the fastest way to delete all elements of value -1 to get an array of the form

np.array([[0, 0], [1, 0, 0], [1, 0]])

Recommended answer

Another method you might consider:

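import numpy as np

def iterative_numpy(a):
    # Keep every element that is not -1 (the value to delete).
    mask = a != -1
    # Filter row by row; dtype=object allows rows of unequal length.
    out = np.array([a[i, mask[i]] for i in range(a.shape[0])], dtype=object)
    return out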
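
As a quick check, the sketch below applies this approach to the array from the question. The function is repeated so the snippet runs on its own. It assumes Python 3 and a recent NumPy, where a ragged result needs an explicit dtype=object. It also shows the out[1][2] versus out[1,2] indexing difference.

```python
import numpy as np

def iterative_numpy(a):
    # Boolean mask of the elements to keep (everything except -1).
    mask = a != -1
    # One filtered 1-D array per row, wrapped in an object array.
    return np.array([a[i, mask[i]] for i in range(a.shape[0])], dtype=object)

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])
out = iterative_numpy(a)

print(out.shape)   # (3,): a 1-D array whose elements are arrays
print(out[1][2])   # 0: index the row first, then the element

try:
    out[1, 2]      # 2-D style indexing is not available here
except IndexError as exc:
    print("out[1, 2] fails:", exc)
```

One caveat: if every row happens to keep the same number of elements, np.array builds a regular 2-D object array instead, so code that consumes the result should not rely on one layout.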

Divakar's method loop_compr_based calculates sums along the rows of mask and a cumulative sum of that result. This method avoids those summations but still has to iterate over the rows of a. It also returns an array of arrays, which has the annoyance that out must be indexed as out[1][2] rather than out[1,2]. Comparing timings on random integer matrices:

In [4]: a = np.random.random_integers(-1,1, size = (3,30))

In [5]: %timeit iterative_numpy(a)
100000 loops, best of 3: 11.1 us per loop

In [6]: %timeit loop_compr_based(a)
10000 loops, best of 3: 20.2 us per loop

In [7]: a = np.random.random_integers(-1,1, size = (30,3))

In [8]: %timeit iterative_numpy(a)
10000 loops, best of 3: 59.5 us per loop

In [9]: %timeit loop_compr_based(a)
10000 loops, best of 3: 30.8 us per loop

In [10]: a = np.random.random_integers(-1,1, size = (30,30))

In [11]: %timeit iterative_numpy(a)
10000 loops, best of 3: 64.6 us per loop

In [12]: %timeit loop_compr_based(a)
10000 loops, best of 3: 36 us per loop

When there are more columns than rows, iterative_numpy wins out. When there are more rows than columns, loop_compr_based wins, but transposing a first will improve the performance of both methods. When the two dimensions are comparable, loop_compr_based is best.
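
These numbers depend on the machine and the NumPy version, so it may be worth re-measuring. The sketch below is one way to time iterative_numpy outside IPython with the standard timeit module. It is an assumption-laden stand-in for the session above: it uses np.random.default_rng(...).integers in place of the deprecated random_integers, and it does not include loop_compr_based, whose code is not shown here.

```python
import timeit
import numpy as np

def iterative_numpy(a):
    mask = a != -1
    return np.array([a[i, mask[i]] for i in range(a.shape[0])], dtype=object)

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily
timings = {}
for shape in [(3, 30), (30, 3), (30, 30)]:
    # The upper bound is exclusive, so values are drawn from {-1, 0, 1}.
    a = rng.integers(-1, 2, size=shape)
    # Best of 3 repeats, 1000 calls each, converted to microseconds per call.
    best = min(timeit.repeat(lambda: iterative_numpy(a), number=1000, repeat=3))
    timings[shape] = best / 1000 * 1e6
    print(shape, f"{timings[shape]:.1f} us per call")
```

Absolute values will differ from the figures quoted above; only the relative trend across shapes is meaningful.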

Outside of the implementation, it's important to note that a numpy array with a non-uniform shape is not an actual array, in the sense that its values do not occupy a contiguous section of memory. Furthermore, the usual array operations will not work as expected.

For example:

>>> a = np.array([[1,2,3],[1,2],[1]])
>>> a*2
array([[1, 2, 3, 1, 2, 3], [1, 2, 1, 2], [1, 1]], dtype=object)

Notice that numpy signals that this is not a regular numeric array through the dtype=object note in the output.
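
The contrast with a regular array can be sketched as follows. This assumes a recent NumPy, where a ragged nested list must be given dtype=object explicitly (without it, array creation raises a ValueError). The variable names are illustrative only.

```python
import numpy as np

# Ragged input: NumPy stores three Python lists inside a 1-D object array.
ragged = np.array([[1, 2, 3], [1, 2], [1]], dtype=object)
print(ragged.dtype, ragged.shape)   # object (3,)

# Multiplication is delegated to each list, so the lists are repeated.
doubled = ragged * 2
print(doubled[0])                   # [1, 2, 3, 1, 2, 3]

# A rectangular array gets the usual element-wise arithmetic.
regular = np.array([[1, 2, 3], [4, 5, 6]])
print((regular * 2).tolist())       # [[2, 4, 6], [8, 10, 12]]
```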

Thus it might be best to just make a list of numpy arrays and use them accordingly.
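
A minimal sketch of that list-based alternative is shown below. The helper name drop_value_rows and its value parameter are hypothetical and do not come from the original answers. Each row stays a regular 1-D integer array, so ordinary arithmetic works row by row.

```python
import numpy as np

def drop_value_rows(a, value=-1):
    """Hypothetical helper: return one filtered 1-D array per row of `a`."""
    return [row[row != value] for row in a]

a = np.array([[0, -1, 0], [1, 0, 0], [1, 0, -1]])
rows = drop_value_rows(a)

for row in rows:
    print(row)

# Each element is a normal numeric array, so this doubles the values.
doubled = [row * 2 for row in rows]
lengths = [len(row) for row in rows]
```

The trade-off is that whole-array operations have to be written as explicit loops or comprehensions over the list.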
