New column in Pandas dataframe based on boolean conditions


Question

I'd like to add a new column to a Pandas dataframe, populated with True or False based on the other values in each row. My approach was to apply a function that checks boolean conditions to each row of the dataframe and fill the new column with the result.

Here is the dataframe:

import pandas as pd

l = {'DayTime': ['2018-03-01', '2018-03-02', '2018-03-03'],
     'Pressure': [9, 10.5, 10.5], 'Feed': [9, 10.5, 11], 'Temp': [9, 10.5, 11]}

df1 = pd.DataFrame(l)

Here is the function I wrote:

def ops_on(row):
   return row[('Feed' > 10)
              & ('Pressure' > 10)
              & ('Temp' > 10)
             ]

The function ops_on is used to create the new column ['ops_on']:

df1['ops_on'] = df1.apply(ops_on, axis='columns')

Unfortunately, I get this error message:

TypeError: ("'>' not supported between instances of 'str' and 'int'", 'occurred at index 0')

Thanks for your help.

Recommended answer

You should work column-wise (vectorised, efficient) rather than row-wise (inefficient, Python-level loop):

df1['ops_on'] = (df1['Feed'] > 10) & (df1['Pressure'] > 10) & (df1['Temp'] > 10)

The & ("and") operator is applied element-wise to Boolean series. Any number of such conditions can be chained, as long as each comparison is wrapped in parentheses.
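As a rough, self-contained sketch of the point above, the snippet below rebuilds the dataframe from the question and applies the chained condition. It also includes a corrected row-wise `ops_on` purely for comparison: the original version failed because `'Feed' > 10` compares the string literal `'Feed'` with an integer instead of looking up the value in the row. The name `row_wise` is introduced here only for illustration.

```python
import pandas as pd

df1 = pd.DataFrame({
    'DayTime': ['2018-03-01', '2018-03-02', '2018-03-03'],
    'Pressure': [9, 10.5, 10.5],
    'Feed': [9, 10.5, 11],
    'Temp': [9, 10.5, 11],
})

# Each comparison yields a Boolean Series; & combines them element-wise.
# The parentheses are required because & binds more tightly than >.
df1['ops_on'] = (df1['Feed'] > 10) & (df1['Pressure'] > 10) & (df1['Temp'] > 10)

# Row-wise version, for comparison only: index the row first, then compare.
def ops_on(row):
    return (row['Feed'] > 10) and (row['Pressure'] > 10) and (row['Temp'] > 10)

row_wise = df1.apply(ops_on, axis='columns')  # hypothetical name for the comparison result

print(df1['ops_on'].tolist())  # [False, True, True]
print(row_wise.tolist())       # [False, True, True]
```

Both approaches give the same result on this data; the column-wise form avoids calling a Python function once per row.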

Alternatively, for the special case where you apply the same comparison to several columns:

df1['ops_on'] = df1[['Feed', 'Pressure', 'Temp']].gt(10).all(axis=1)
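To make the two steps of this one-liner visible, here is a minimal sketch that splits it apart, using the sample data from the question. The intermediate names `cols` and `above` are assumptions added for readability and are not part of the original answer.

```python
import pandas as pd

df1 = pd.DataFrame({
    'DayTime': ['2018-03-01', '2018-03-02', '2018-03-03'],
    'Pressure': [9, 10.5, 10.5],
    'Feed': [9, 10.5, 11],
    'Temp': [9, 10.5, 11],
})

cols = ['Feed', 'Pressure', 'Temp']

# gt(10) compares every cell with 10 and returns a Boolean DataFrame.
above = df1[cols].gt(10)

# all(axis=1) reduces across the columns: True only if every cell in the row is True.
df1['ops_on'] = above.all(axis=1)

print(df1['ops_on'].tolist())  # [False, True, True]
```

This form scales more comfortably when the list of columns grows, since only `cols` needs to change.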
