pandas 使用单个条件过滤多列 [英] Pandas filter mulitple columns with single criteria
问题描述
我有一个100列以上的Excel工作表.我需要过滤其中大约五个,以查看其中一个单元格中哪些列为"no".有没有一种方法可以使用单个搜索条件来过滤多列,例如:
I have an excel sheet with over a hundred columns. I need to filter about five of these to see which columns have "no" in one of the cells. Is there a way to filter multiple columns with a single search criteria such as:
no_invoice_filter = df[(df['M1: PL - INVOICED']) & (df['M2: EX - INVOICED']) & (df['M3: TEST DEP - INVOICED']) == 'No']
如果每列等于否",则分别写出
As oppossed to seperately writing out if each column equals "no"
以上代码错误:
TypeError: unsupported operand type(s) for &: 'str' and 'bool'
推荐答案
您可以这样做:
df[(df[['M1: PL - INVOICED','M2: EX - INVOICED','M3: TEST DEP - INVOICED']] == 'No')]
因此,您基本上可以传递感兴趣的列的列表,并将这些列与标量值进行比较,如果在任何地方都出现否"之后,请使用 any(axis = 1)
So you essentially pass a list of the cols of interest and compare just these columns against your scalar value, if you're after 'No' appearing anywhere then use any(axis=1)
In [115]:
df = pd.DataFrame({'a':'no', 'b':'yes', 'c':['yes','no','yes','no','no']})
df
Out[115]:
a b c
0 no yes yes
1 no yes no
2 no yes yes
3 no yes no
4 no yes no
使用 any(axis = 1)
时,它会返回所有在感兴趣的列中没有出现的行:
With any(axis=1)
then it returns all rows where No appears in any of the cols of interest:
In [133]:
df[(df[['a','c']] == 'no').any(axis=1)]
Out[133]:
a b c
0 no yes yes
1 no yes no
2 no yes yes
3 no yes no
4 no yes no
您还可以使用掩码使用 dropna
You can also use the mask to drop NaN rows for a specific column using dropna
In [132]:
df[df[['a','c']] == 'no'].dropna(subset=['c'])
Out[132]:
a b c
1 no NaN no
3 no NaN no
4 no NaN no
这篇关于 pandas 使用单个条件过滤多列的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!