pandas 使用单个条件过滤多列 [英] Pandas filter mulitple columns with single criteria

查看:44
本文介绍了 pandas 使用单个条件过滤多列的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个100列以上的Excel工作表.我需要过滤其中大约五个,以查看其中一个单元格中哪些列为"no".有没有一种方法可以使用单个搜索条件来过滤多列,例如:

I have an excel sheet with over a hundred columns. I need to filter about five of these to see which columns have "no" in one of the cells. Is there a way to filter multiple columns with a single search criteria such as:

 no_invoice_filter = df[(df['M1: PL - INVOICED']) & (df['M2: EX - INVOICED']) & (df['M3: TEST DEP - INVOICED']) == 'No']

如果每列等于否",则分别写出

As oppossed to seperately writing out if each column equals "no"

以上代码错误:

TypeError: unsupported operand type(s) for &: 'str' and 'bool'

推荐答案

您可以这样做:

df[(df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')]

因此,您基本上可以传递感兴趣的列的列表,并将这些列与标量值进行比较,如果在任何地方都出现否"之后,请使用 any(axis = 1)

So you essentially pass a list of the cols of interest and compare just these columns against your scalar value, if you're after 'No' appearing anywhere then use any(axis=1)

In [115]:
df = pd.DataFrame({'a':'no', 'b':'yes', 'c':['yes','no','yes','no','no']})
df

Out[115]:
    a    b    c
0  no  yes  yes
1  no  yes   no
2  no  yes  yes
3  no  yes   no
4  no  yes   no

使用 any(axis = 1)时,它会返回所有在感兴趣的列中没有出现的行:

With any(axis=1) then it returns all rows where No appears in any of the cols of interest:

In [133]:    
df[(df[['a','c']] == 'no').any(axis=1)]

Out[133]:
    a    b    c
0  no  yes  yes
1  no  yes   no
2  no  yes  yes
3  no  yes   no
4  no  yes   no

您还可以使用掩码使用 dropna

You can also use the mask to drop NaN rows for a specific column using dropna

In [132]:    
df[df[['a','c']] == 'no'].dropna(subset=['c'])

Out[132]:
    a    b   c
1  no  NaN  no
3  no  NaN  no
4  no  NaN  no

这篇关于 pandas 使用单个条件过滤多列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆