Defining a pandas column based on a combination of other columns


Problem description

I want to create a new column in my pandas DataFrame based on the values in existing columns. The values of the new column should be boolean. At the moment I am trying the following:

import pandas as pd

df_edit = pd.DataFrame({'Included': [False, False, True, False],
                        'Update Check': [True, True, True, True],
                        'duplicate_fname': [True, False, False, False],
                        'duplicate_targetfname': [False, False, False, False]})

df_edit['test'] = df_edit['Included'] == False & \
                  df_edit['Update Check'] == True & \
                  (df_edit['duplicate_fname'] == True |
                   df_edit['duplicate_targetfname'] == True)

When I do it like this, I get the following ValueError:

ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Is there another way to do this?

My expected output is a column with the following values:

True, False, False, False

Recommended answer

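The problem is the missing parentheses. The & and | operators bind more tightly than ==, so each comparison has to be wrapped in its own parentheses: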

df_edit['test'] = (df_edit['Included'] == False) & \
                  (df_edit['Update Check'] == True) & \
                  ((df_edit['duplicate_fname'] == True) | 
                   (df_edit['duplicate_targetfname'] == True))

print (df_edit)
   Included  Update Check  duplicate_fname  duplicate_targetfname   test
0     False          True             True                  False   True
1     False          True            False                  False  False
2      True          True            False                  False  False
3     False          True            False                  False  False
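
To see why the parentheses matter, here is a minimal sketch using two small Series. The names a, b, raised and fixed are illustrative and not part of the original question. Without parentheses, Python parses the expression as a chained comparison, which forces a Series into a single truth value and raises the error quoted above.

```python
import pandas as pd

# Illustrative data only; `a` and `b` are hypothetical names.
a = pd.Series([False, False, True, False])
b = pd.Series([True, True, True, True])

# `&` binds tighter than `==`, so Python reads
#     a == False & b == True
# as the chained comparison
#     a == (False & b) == True
# which means (a == (False & b)) and ((False & b) == True).
# The implicit `and` needs the truth value of a whole Series.
try:
    result = a == False & b == True
    raised = False
except ValueError:
    raised = True

# With parentheses, each comparison is evaluated first and `&`
# then combines the two boolean Series element-wise.
fixed = (a == False) & (b == True)

print(raised)
print(fixed.tolist())
```

The same reasoning applies to | inside the inner parentheses of the original expression.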

A better approach is to invert the boolean mask with ~ and drop the comparisons with True:

df_edit['test'] = (~df_edit['Included'] &
                   df_edit['Update Check'] &
                   (df_edit['duplicate_fname'] | df_edit['duplicate_targetfname']))
print (df_edit)

   Included  Update Check  duplicate_fname  duplicate_targetfname   test
0     False          True             True                  False   True
1     False          True            False                  False  False
2      True          True            False                  False  False
3     False          True            False                  False  False
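
As a quick sanity check, the sketch below rebuilds the DataFrame from the question and confirms that both forms produce the same column. The variable names explicit, compact, same and values are illustrative, and the sketch assumes all four columns have bool dtype, as they do here.

```python
import pandas as pd

df_edit = pd.DataFrame({
    'Included': [False, False, True, False],
    'Update Check': [True, True, True, True],
    'duplicate_fname': [True, False, False, False],
    'duplicate_targetfname': [False, False, False, False],
})

# Explicit comparisons, each wrapped in its own parentheses.
explicit = ((df_edit['Included'] == False)
            & (df_edit['Update Check'] == True)
            & ((df_edit['duplicate_fname'] == True)
               | (df_edit['duplicate_targetfname'] == True)))

# Compact form: ~ inverts the mask, and boolean columns are used directly.
compact = (~df_edit['Included']
           & df_edit['Update Check']
           & (df_edit['duplicate_fname'] | df_edit['duplicate_targetfname']))

same = explicit.equals(compact)
values = compact.tolist()

print(same)
print(values)
```

Wrapping the whole expression in one outer pair of parentheses lets it span several lines without backslashes.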
