Filtering pandas dataframe with multiple Boolean columns
Question
I am trying to filter a df using several Boolean columns that are part of the df, but have been unable to do so.
Sample data:
A | B | C | D
John Doe | 45 | True | False
Jane Smith | 32 | False | False
Alan Holmes | 55 | False | True
Eric Lamar | 29 | True | True
The dtype for columns C and D is Boolean. I want to create a new df (df1) containing only the rows where either C or D is True. It should look like this:
A | B | C | D
John Doe | 45 | True | False
Alan Holmes | 55 | False | True
Eric Lamar | 29 | True | True
I've tried something like this, which runs into problems because it can't handle the Boolean type:
df1 = df[(df['C']=='True') or (df['D']=='True')]
Any ideas?
Recommended answer
In [82]: d
Out[82]:
A B C D
0 John Doe 45 True False
1 Jane Smith 32 False False
2 Alan Holmes 55 False True
3 Eric Lamar 29 True True
Solution 1:
In [83]: d.loc[d.C | d.D]
Out[83]:
A B C D
0 John Doe 45 True False
2 Alan Holmes 55 False True
3 Eric Lamar 29 True True
Solution 2:
In [94]: d[d[['C','D']].any(axis=1)]
Out[94]:
A B C D
0 John Doe 45 True False
2 Alan Holmes 55 False True
3 Eric Lamar 29 True True
Solution 3:
In [95]: d.query("C or D")
Out[95]:
A B C D
0 John Doe 45 True False
2 Alan Holmes 55 False True
3 Eric Lamar 29 True True
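For readers who want to try this outside an IPython session, here is a minimal, self-contained sketch that rebuilds the sample data and applies the three solutions side by side. The variable names by_or, by_any and by_query are made up for illustration and are not part of the original answer.

```python
import pandas as pd

# Rebuild the sample data from the question.
d = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

# Solution 1: element-wise OR of the two boolean columns.
by_or = d.loc[d["C"] | d["D"]]

# Solution 2: row-wise any() over the selected columns.
by_any = d[d[["C", "D"]].any(axis=1)]

# Solution 3: query() evaluates the expression against the column names.
by_query = d.query("C or D")

print(by_or)
```

All three should select the rows with index 0, 2 and 3. The any(axis=1) form is probably the most convenient when there are many Boolean columns, since the list of column names can be built programmatically.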
PS: if you change your attempt to:
df[(df['C']==True) | (df['D']==True)]
it will also work.
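The sketch below illustrates what seems to go wrong in the attempt from the question, assuming a reasonably recent pandas: comparing a Boolean column with the string 'True' matches nothing, and Python's `or` cannot combine two Series. The names string_mask and raised are hypothetical helpers used only for this illustration.

```python
import pandas as pd

df = pd.DataFrame({"C": [True, False, False, True],
                   "D": [False, False, True, True]})

# Comparing a boolean column with the string 'True' matches no rows.
string_mask = df["C"] == "True"
print(string_mask.any())

# Python's `or` asks for the truth value of a whole Series,
# which pandas refuses with a ValueError.
try:
    df[(df["C"] == True) or (df["D"] == True)]
    raised = False
except ValueError:
    raised = True
print(raised)

# `|` combines the two masks element by element. The parentheses are
# required because `|` binds more tightly than `==`.
df1 = df[(df["C"] == True) | (df["D"] == True)]
print(df1)
```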
Why we should NOT use the "PEP-compliant"
df["col_name"] is True
instead of df["col_name"] == True
?
In [11]: df = pd.DataFrame({"col":[True, True, True]})
In [12]: df
Out[12]:
col
0 True
1 True
2 True
In [13]: df["col"] is True
Out[13]: False # <----- oops, that's not exactly what we wanted
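The reason is that `is` tests object identity, so it compares the Series object itself with the singleton True and returns a single False, whereas `==` is overloaded by pandas to compare element by element. A small sketch of the difference follows; the names identity_check, mask and same_rows are illustrative only.

```python
import pandas as pd

df = pd.DataFrame({"col": [True, False, True]})

# `is` compares object identity: a Series is never the singleton True,
# so this is a plain Python bool, not a mask.
identity_check = df["col"] is True

# `==` is overloaded by pandas and yields one boolean per row.
mask = df["col"] == True

# A boolean column is already a mask, so it can also be used directly.
same_rows = df[df["col"]]

print(identity_check)
print(mask.tolist())
print(same_rows)
```

Using the column directly, as in the last step, avoids the `== True` comparison that style checkers tend to flag, while still filtering row by row.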