Python numpy filter two-dimensional array by condition


Question

Python newbie here. I have read "Filter rows of a numpy array?" and the docs, but I still can't figure out how to code this the Pythonic way.

Example array I have (the real data is 50000 x 10):

a = numpy.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
filter = ['a','c']

I need to find all rows in a whose a[:, 1] value is in filter. Expected result:

[[2,'a'],[4,'c']]

My current code is this:

numpy.asarray([x for x in a if x[1] in filter ])

It works okay, but I have read somewhere that it is not efficient. What is the proper numpy method for this?

Thanks for all the correct answers! Unfortunately I can only mark one as the accepted answer. I am surprised that numpy.in1d does not turn up in Google searches for "numpy filter 2d array".

Recommended answer

You can use a boolean index array, which you can produce using np.in1d.

You can index an np.ndarray along any axis using, for example, an array of bools indicating whether each element should be included. Since you want to index along axis=0, meaning you want to choose from the outermost index, you need a 1D np.array whose length is the number of rows. Each of its elements indicates whether the corresponding row should be included.
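To make the boolean-indexing idea concrete, here is a minimal sketch with a hand-written mask. The mask values are made up for illustration; only the array comes from the question.

```python
import numpy as np

# Same toy array as in the question. Mixing ints and strings makes NumPy
# store every entry as a string.
a = np.asarray([[2, 'a'], [3, 'b'], [4, 'c'], [5, 'd']])

# Hand-written mask (illustrative): one bool per row, True keeps the row.
mask = np.array([True, False, True, False])

selected = a[mask]          # a 1D bool array indexes along axis=0
print(selected.tolist())    # [['2', 'a'], ['4', 'c']]
```

Note that the numbers come back as strings, because the array holds a single string dtype.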

A fast way to get this mask is to use np.in1d on the second column of a. You get all elements of that column with a[:, 1]. Now you have a 1D np.array whose elements should be checked against your filter. That's what np.in1d is for.
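The two steps can be inspected separately. This sketch uses np.isin, which the NumPy documentation recommends over np.in1d for new code and which returns the same mask for 1D input. The name `wanted` is my own choice, used to avoid shadowing the built-in filter().

```python
import numpy as np

a = np.asarray([[2, 'a'], [3, 'b'], [4, 'c'], [5, 'd']])
wanted = np.asarray(['a', 'c'])   # hypothetical name, replaces `filter`

column = a[:, 1]                  # second column as a 1D array
mask = np.isin(column, wanted)    # one bool per row

print(column.tolist())  # ['a', 'b', 'c', 'd']
print(mask.tolist())    # [True, False, True, False]
```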

So the complete code looks like this:

import numpy as np

a = np.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
filter = np.asarray(['a','c'])
a[np.in1d(a[:, 1], filter)]

Or in longer form:

import numpy as np

a = np.asarray([[2,'a'],[3,'b'],[4,'c'],[5,'d']])
filter = np.asarray(['a','c'])
mask = np.in1d(a[:, 1], filter)
a[mask]
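As a quick sanity check, the mask-based selection can be compared against the list comprehension from the question. This is a sketch on the toy data only and makes no timing claims; np.isin stands in for np.in1d, and `wanted` is again a hypothetical name.

```python
import numpy as np

a = np.asarray([[2, 'a'], [3, 'b'], [4, 'c'], [5, 'd']])
wanted = ['a', 'c']

# Row-by-row Python loop, as in the question.
looped = np.asarray([row for row in a if row[1] in wanted])

# Vectorized mask, as in the answer (np.isin used in place of np.in1d).
masked = a[np.isin(a[:, 1], wanted)]

print(np.array_equal(looped, masked))  # True
```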
