Python numpy filter two-dimensional array by condition
Question
Python newbie here. I have read "Filter rows of a numpy array?" and the docs, but I still can't figure out how to code this the Pythonic way.
Example array (the real data is 50000 x 10):
a = numpy.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
filter = ['a','c']
I need to find all rows in a where a[:, 1] is in filter. Expected result:
[[2,'a'],[4,'c']]
My current code is:
numpy.asarray([x for x in a if x[1] in filter ])
It works, but I have read somewhere that it is not efficient. What is the proper numpy method for this?
Thanks for all the correct answers! Unfortunately I can only mark one as the accepted answer. I am surprised that numpy.in1d does not turn up in Google searches for "numpy filter 2d array".
Recommended answer
You can use a bool index array, which you can produce with np.in1d.
You can index a np.ndarray along any axis using, for example, an array of bools indicating whether each element should be included. Since you want to index along axis=0, meaning you want to select along the outermost index, you need a 1D np.array whose length is the number of rows. Each of its elements indicates whether the corresponding row should be included.
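To make the indexing step concrete, here is a minimal sketch that writes the bool array by hand for the question's four-row example. The name selected is only illustrative. Because the list mixes integers and strings, NumPy stores every entry as a string.

```python
import numpy as np

a = np.asarray([[2, 'a'], [3, 'b'], [4, 'c'], [5, 'd']])

# Hand-written mask for illustration: one bool per row of a.
mask = np.array([True, False, True, False])

# Indexing with the mask selects along axis 0, keeping rows 0 and 2.
selected = a[mask]
print(selected)
```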
A fast way to get this mask is to use np.in1d on the second column of a. You get all elements of that column with a[:, 1]. Now you have a 1D np.array whose elements should be checked against your filter. That's what np.in1d is for.
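As a rough illustration of what that membership test returns, the sketch below prints the column and the resulting mask. It uses np.isin rather than np.in1d: for a 1D input the two perform the same test, and NumPy 2.0 deprecates np.in1d in favor of np.isin. The name wanted is a stand-in for the question's filter, chosen here to avoid shadowing Python's built-in filter.

```python
import numpy as np

a = np.asarray([[2, 'a'], [3, 'b'], [4, 'c'], [5, 'd']])
wanted = np.asarray(['a', 'c'])  # stand-in name for the question's filter

# Second column as a 1D array of strings.
second_column = a[:, 1]

# One bool per row: True where the column value appears in wanted.
mask = np.isin(second_column, wanted)

print(second_column)
print(mask)
```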
So the complete code looks like this:
import numpy as np
a = np.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
filter = np.asarray(['a','c'])
a[np.in1d(a[:, 1], filter)]
Or in a longer form:
import numpy as np
a = np.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
filter = np.asarray(['a','c'])
mask = np.in1d(a[:, 1], filter)
a[mask]
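As a quick sanity check, the following sketch compares the list comprehension from the question with the mask-based selection on the small example. The names wanted, by_loop and by_mask are hypothetical, and np.isin is used in place of np.in1d on the assumption that a recent NumPy version is installed.

```python
import numpy as np

a = np.asarray([[2, 'a'], [3, 'b'], [4, 'c'], [5, 'd']])
wanted = ['a', 'c']  # stand-in name for the question's filter

# The question's approach: a Python-level loop over the rows.
by_loop = np.asarray([row for row in a if row[1] in wanted])

# The mask-based approach: one vectorized membership test.
by_mask = a[np.isin(a[:, 1], wanted)]

# Both approaches should select the same rows.
print(np.array_equal(by_loop, by_mask))
```

The mask version avoids the per-row Python loop, which is why it tends to scale better to arrays like the 50000 x 10 data mentioned in the question.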