Removing rows in NumPy efficiently

Question

I have a large numpy array with a lot of ID values (call it X):

X:
id   rating
1    88
2    99
3    77
4    66
...

etc. I also have another numpy array of "bad IDs" -- which signify rows I'd like to remove from X.

B: [2, 3]

So when I'm done, I'd like:

X:
id   rating
1    88
4    66

What is the cleanest way to do this, without iterating?

Recommended answer

This is the fastest way I could come up with:

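import numpy

# 500,000 rows of (id, value); the ids in column 0 are 0, 2, 4, ...
x = numpy.arange(1000000, dtype=numpy.int32).reshape((-1, 2))
# 500 ids to remove: 0, 2000, 4000, ...
bad = numpy.arange(0, 1000000, 2000, dtype=numpy.int32)

print(x.shape)
print(bad.shape)

# Note: NumPy 2.0 deprecates numpy.in1d in favor of numpy.isin.
cleared = numpy.delete(x, numpy.where(numpy.in1d(x[:,0], bad)), 0)
print(cleared.shape)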

This prints:

(500000, 2)
(500,)
(499500, 2)

and runs much faster than a ufunc. It will use some extra memory, but whether that's okay for you depends on how big your array is.

Explanation:

  • numpy.in1d returns a boolean array with one entry per row of x (the same length as x[:,0]), containing True if that row's ID is in the bad array and False otherwise.
  • numpy.where turns that True/False array into an array of integers containing the index values where the array was True.
  • Those index locations are then passed to numpy.delete, telling it to delete along the first axis (0), that is, to delete whole rows.
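
The three steps above can be traced on the small example from the question. This is only a sketch: the variable names mask, rows and kept are illustrative and not from the original answer, and it uses numpy.isin, the newer name for the membership test that the answer performs with numpy.in1d. The last line shows boolean indexing as a possible alternative that skips the index array; it is an assumption that this suits your case, not something the answer benchmarked.

```python
import numpy

# Toy data from the question: column 0 is the id, column 1 is the rating.
x = numpy.array([[1, 88], [2, 99], [3, 77], [4, 66]])
bad = numpy.array([2, 3])

# Step 1: boolean mask, one entry per row, True where the id is in bad.
# (numpy.isin is used here in place of the answer's numpy.in1d.)
mask = numpy.isin(x[:, 0], bad)

# Step 2: integer indices of the rows where the mask is True.
rows = numpy.where(mask)[0]

# Step 3: delete those rows along the first axis (0).
cleared = numpy.delete(x, rows, axis=0)

# Alternative sketch: keep the rows whose mask entry is False.
kept = x[~mask]

print(cleared)
```

Both cleared and kept hold the rows with IDs 1 and 4, and x itself is left unchanged because numpy.delete and boolean indexing each return a new array.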
