Removing rows in NumPy efficiently
Question
I have a large numpy array with a lot of ID values (call it X):
X:
id rating
1 88
2 99
3 77
4 66
...
etc. I also have another numpy array of "bad IDs", which signify rows I'd like to remove from X.
B: [2, 3]
So when I'm done, I'd like:
X:
id rating
1 88
4 66
What is the cleanest way to do this, without iterating?
Recommended answer
This is the fastest way I could come up with:
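import numpy

# 500,000 rows of (id, value); every 1000th row has an id listed in bad
x = numpy.arange(1000000, dtype=numpy.int32).reshape((-1, 2))
bad = numpy.arange(0, 1000000, 2000, dtype=numpy.int32)
print(x.shape)
print(bad.shape)
# numpy.in1d is deprecated since NumPy 2.0; numpy.isin is the replacement
cleared = numpy.delete(x, numpy.where(numpy.in1d(x[:,0], bad)), 0)
print(cleared.shape)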
This prints:
(500000, 2)
(500,)
(499500, 2)
and runs much faster than a ufunc. It will use some extra memory, but whether that's okay for you depends on how big your array is.
Explanation:
- numpy.in1d returns a boolean array the same length as the first column of x, containing True where the element is in the bad array and False otherwise.
- numpy.where turns that True/False array into an array of integers containing the index values where the array was True.
- Those index locations are then passed to numpy.delete, telling it to delete along the first axis (0).
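To make the three steps concrete, here is a minimal sketch that runs them one at a time on the small X and B from the question. It is an illustration, not part of the original answer: the names mask, idx and kept are made up for the example, and it uses numpy.isin, the newer equivalent of numpy.in1d for 1-D input. The last line shows a boolean-mask variant that should give the same rows without calling numpy.delete.

```python
import numpy

# The small example from the question: columns are (id, rating).
x = numpy.array([[1, 88], [2, 99], [3, 77], [4, 66]])
bad = numpy.array([2, 3])

# Step 1: True where the id in the first column appears in bad.
mask = numpy.isin(x[:, 0], bad)

# Step 2: integer positions of the True entries.
# numpy.where returns a tuple with one array per dimension, hence [0].
idx = numpy.where(mask)[0]

# Step 3: drop those rows (axis 0).
cleared = numpy.delete(x, idx, axis=0)
print(cleared)

# Variant (not from the original answer): index with the inverted mask.
kept = x[~mask]
```

Both cleared and kept are new arrays; x itself is left unchanged, which is where the extra memory mentioned above goes.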