Delete the first element in each subview of a matrix
Question
I have a dataset like this:
[[0,1],
[0,2],
[0,3],
[0,4],
[1,5],
[1,6],
[1,7],
[2,8],
[2,9]]
I need to delete the first row of each subview of the data, where the subviews are defined by the first column. First I take all rows that have 0 in the first column and delete the first of them, [0,1]. Then I take the rows with 1 in the first column and delete the first of them, [1,5]. Next I delete [2,8], and so on. In the end, I would like to have a dataset like this:
[[0,2],
[0,3],
[0,4],
[1,6],
[1,7],
[2,9]]
Can this be done in numpy? My dataset is very large so for loops on all elements take at least 4 minutes to complete.
Recommended answer
As requested, a numpy solution:
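import numpy as np
a = np.array([[0,1], [0,2], [0,3], [0,4], [1,5], [1,6], [1,7], [2,8], [2,9]])
# i holds the index of the first occurrence of each unique value in column 0
_,i = np.unique(a[:,0], return_index=True)
# drop those rows; np.delete returns a new array and leaves a unchanged
b = np.delete(a, i, axis=0)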
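To make the intermediate values concrete, here is a small sketch of the same steps on the sample data from the question. The names keys and first_idx are only illustrative and are not part of the original answer.

```python
import numpy as np

a = np.array([[0, 1], [0, 2], [0, 3], [0, 4],
              [1, 5], [1, 6], [1, 7],
              [2, 8], [2, 9]])

# For each unique value in column 0, return_index=True also reports
# the index of its first occurrence in the original array.
keys, first_idx = np.unique(a[:, 0], return_index=True)
print(keys)        # [0 1 2]
print(first_idx)   # [0 4 7]

# Deleting rows 0, 4 and 7 removes [0,1], [1,5] and [2,8].
b = np.delete(a, first_idx, axis=0)
print(b.tolist())  # [[0, 2], [0, 3], [0, 4], [1, 6], [1, 7], [2, 9]]
```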
(The above was edited to incorporate @Jaime's solution. Here is my original masking solution, for posterity's sake:)
m = np.ones(len(a), dtype=bool)
m[i] = False
b = a[m]
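The sample data happens to be sorted by the first column, but the mask approach should not depend on that, since np.unique reports the first occurrence of each value wherever it sits. A minimal sketch with a made-up, unsorted array (the data here is an assumption for illustration, not from the question):

```python
import numpy as np

# Hypothetical unsorted input: the groups 0 and 1 are interleaved.
a = np.array([[1, 5], [0, 1], [1, 6], [0, 2], [0, 3]])

# First occurrences: value 0 first appears at row 1, value 1 at row 0.
_, i = np.unique(a[:, 0], return_index=True)
print(i)  # [1 0]

# Start with "keep everything", then switch off the first row of each group.
m = np.ones(len(a), dtype=bool)
m[i] = False
b = a[m]
print(b.tolist())  # [[1, 6], [0, 2], [0, 3]]
```

The remaining rows keep their original relative order, because boolean indexing only filters rows and does not reorder them.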
Interestingly, the mask seems to be faster:
In [225]: def rem_del(a):
   .....:     _,i = np.unique(a[:,0], return_index=True)
   .....:     return np.delete(a, i, axis=0)
   .....:
In [226]: def rem_mask(a):
   .....:     _,i = np.unique(a[:,0], return_index=True)
   .....:     m = np.ones(len(a), dtype=bool)
   .....:     m[i] = False
   .....:     return a[m]
   .....:
In [227]: timeit rem_del(a)
10000 loops, best of 3: 181 us per loop
In [228]: timeit rem_mask(a)
10000 loops, best of 3: 59 us per loop
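The timings above are for the 9-row sample array, so they mostly measure fixed overhead. Since the question mentions a very large dataset, it may be worth repeating the comparison at a larger size. The sketch below uses the standard timeit module outside IPython. The array size, the number of groups and the random seed are arbitrary assumptions, and the results will vary by machine and NumPy version, so no expected timings are shown.

```python
import timeit
import numpy as np

def rem_del(a):
    _, i = np.unique(a[:, 0], return_index=True)
    return np.delete(a, i, axis=0)

def rem_mask(a):
    _, i = np.unique(a[:, 0], return_index=True)
    m = np.ones(len(a), dtype=bool)
    m[i] = False
    return a[m]

# Synthetic data (an assumption): 200,000 rows, group keys in [0, 1000), sorted by key.
rng = np.random.default_rng(0)
keys = np.sort(rng.integers(0, 1000, size=200_000))
a = np.column_stack((keys, np.arange(keys.size)))

# Both functions should return the same rows.
assert np.array_equal(rem_del(a), rem_mask(a))

for fn in (rem_del, rem_mask):
    # Best of 3 repeats, 5 calls each, reported per call.
    t = min(timeit.repeat(lambda: fn(a), number=5, repeat=3)) / 5
    print(f"{fn.__name__}: {t * 1e3:.2f} ms per call")
```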