Filtering pandas dataframe with multiple Boolean columns

Question

I am trying to filter a df using several Boolean columns that are part of the df, but have been unable to do so.

Sample data:

A | B | C | D
John Doe | 45 | True | False
Jane Smith | 32 | False | False
Alan Holmes | 55 | False | True
Eric Lamar | 29 | True | True

Columns C and D have dtype bool. I want to create a new df (df1) containing only the rows where either C or D is True. It should look like this:

A | B | C | D
John Doe | 45 | True | False
Alan Holmes | 55 | False | True
Eric Lamar | 29 | True | True

I've tried something like this, which runs into problems because it can't handle the Boolean type:

df1 = df[(df['C']=='True') or (df['D']=='True')]

Any ideas?

Recommended answer

In [82]: d
Out[82]:
             A   B      C      D
0     John Doe  45   True  False
1   Jane Smith  32  False  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

Solution 1:

In [83]: d.loc[d.C | d.D]
Out[83]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

Solution 2:

In [94]: d[d[['C','D']].any(axis=1)]
Out[94]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True

Solution 3:

In [95]: d.query("C or D")
Out[95]:
             A   B      C      D
0     John Doe  45   True  False
2  Alan Holmes  55  False   True
3   Eric Lamar  29   True   True
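
For anyone who wants to reproduce this outside IPython, here is a minimal sketch that rebuilds the sample frame and checks that the three solutions select the same rows. The variable names `by_or`, `by_any` and `by_query` are just illustrative, and `axis=1` is passed by keyword because recent pandas versions no longer accept it positionally.

```python
import pandas as pd

# Rebuild the sample data from the question.
d = pd.DataFrame({
    "A": ["John Doe", "Jane Smith", "Alan Holmes", "Eric Lamar"],
    "B": [45, 32, 55, 29],
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

# Solution 1: element-wise OR of the two Boolean columns.
by_or = d.loc[d.C | d.D]

# Solution 2: row-wise any() over the selected Boolean columns.
by_any = d[d[["C", "D"]].any(axis=1)]

# Solution 3: query string; "or" is evaluated element-wise here.
by_query = d.query("C or D")

# All three keep the rows with index labels 0, 2 and 3.
print(by_or.index.tolist())
print(by_or.equals(by_any), by_or.equals(by_query))
```

Solution 2 tends to scale best when there are many Boolean columns, since only the list of column names has to change.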

PS If you change your solution to:

df[(df['C']==True) | (df['D']==True)]

it will also work.
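
As far as I can tell, the original attempt has two separate problems, sketched below on a cut-down frame holding only the two Boolean columns. Comparing a Boolean column to the string 'True' matches nothing, and Python's `or` asks for a single truth value of a whole Series, which pandas refuses to give. The names `string_mask` and `error_message` are only for illustration.

```python
import pandas as pd

df = pd.DataFrame({
    "C": [True, False, False, True],
    "D": [False, False, True, True],
})

# Problem 1: the string 'True' is not the Boolean True,
# so this mask is False for every row.
string_mask = df["C"] == "True"
print(string_mask.any())

# Problem 2: `or` needs one truth value per operand, but each
# operand is a whole Series, so pandas raises ValueError.
error_message = None
try:
    df[(df["C"] == True) or (df["D"] == True)]
except ValueError as exc:
    error_message = str(exc)
print(error_message)

# The element-wise operator `|` (with parentheses) is what works.
df1 = df[(df["C"] == True) | (df["D"] == True)]
print(df1.index.tolist())
```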

pandas docs: Boolean indexing

Why should we NOT use the "PEP-compliant" df["col_name"] is True instead of df["col_name"] == True?

In [11]: df = pd.DataFrame({"col":[True, True, True]})

In [12]: df
Out[12]:
    col
0  True
1  True
2  True

In [13]: df["col"] is True
Out[13]: False               # <----- oops, that's not exactly what we wanted
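
To spell out the difference, here is a small sketch using a column with mixed values (my own variation on the example above). `is` tests object identity, so it yields one plain Python bool for the whole Series, while `==` compares element by element and yields a mask. For a column that is already Boolean, the column itself can serve as the mask.

```python
import pandas as pd

df = pd.DataFrame({"col": [True, False, True]})

# Identity check: the Series object is not the singleton True,
# so this is a single False no matter what the column holds.
identity_check = df["col"] is True
print(identity_check)

# Element-wise comparison: one Boolean per row, usable as a mask.
elementwise = df["col"] == True
print(elementwise.tolist())

# A Boolean column can be used directly, without `== True`.
print(df[df["col"]].index.tolist())
```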
