Defining a pandas column based on a combination of other columns
Problem description
I want to create a new column in my pandas DataFrame based on the values in existing columns. The new column should be boolean. At the moment I am trying the following:
import pandas as pd

df_edit = pd.DataFrame({
    'Included': [False, False, True, False],
    'Update Check': [True, True, True, True],
    'duplicate_fname': [True, False, False, False],
    'duplicate_targetfname': [False, False, False, False],
})

# This is the attempt that raises the ValueError quoted below.
df_edit['test'] = (
    df_edit['Included'] == False &
    df_edit['Update Check'] == True &
    (df_edit['duplicate_fname'] == True | df_edit['duplicate_targetfname'] == True)
)
When I try to do it like this I get a ValueError stating the following:
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().
Is there another way to do this?
My expected output would be a column that consists of the following values:
True, False, False, False
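Recommended answer
The problem is the missing parentheses. In Python, & and | bind more tightly than ==, so each comparison has to be wrapped in its own parentheses: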
df_edit['test'] = (df_edit['Included'] == False) & \
(df_edit['Update Check'] == True) & \
((df_edit['duplicate_fname'] == True) |
(df_edit['duplicate_targetfname'] == True))
print(df_edit)
   Included  Update Check  duplicate_fname  duplicate_targetfname   test
0     False          True             True                  False   True
1     False          True            False                  False  False
2      True          True            False                  False  False
3     False          True            False                  False  False
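To see why the parentheses matter, here is a small sketch of the precedence issue on its own. The Series names `included` and `update_check` are made up for this illustration and are not part of the question's code.

```python
import pandas as pd

# Hypothetical stand-ins for two of the question's columns.
included = pd.Series([False, False, True, False])
update_check = pd.Series([True, True, True, True])

# Without parentheses, `False & update_check` is evaluated first. What is left
# is the chained comparison `included == (...) == True`, which Python expands
# with `and`, so it needs bool(Series) and raises the ValueError.
try:
    included == False & update_check == True
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# With parentheses, each comparison produces a boolean Series before `&` runs.
mask = (included == False) & (update_check == True)
print(mask.tolist())
```

The same reasoning applies to `|` inside the inner parentheses of the original attempt.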
But it is better to use ~ to invert the boolean mask and to omit the comparisons with True:
# The outer parentheses let the expression span several lines.
df_edit['test'] = (
    ~df_edit['Included'] &
    df_edit['Update Check'] &
    (df_edit['duplicate_fname'] | df_edit['duplicate_targetfname'])
)
print(df_edit)
   Included  Update Check  duplicate_fname  duplicate_targetfname   test
0     False          True             True                  False   True
1     False          True            False                  False  False
2      True          True            False                  False  False
3     False          True            False                  False  False
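As a quick sanity check, the sketch below builds both masks from the question's data and compares them with each other and with the expected output. The variable names `with_parens`, `with_invert`, `same` and `result` are introduced here for illustration only.

```python
import pandas as pd

df_edit = pd.DataFrame({
    'Included': [False, False, True, False],
    'Update Check': [True, True, True, True],
    'duplicate_fname': [True, False, False, False],
    'duplicate_targetfname': [False, False, False, False],
})

# Explicit comparisons, each wrapped in parentheses.
with_parens = (
    (df_edit['Included'] == False) &
    (df_edit['Update Check'] == True) &
    ((df_edit['duplicate_fname'] == True) |
     (df_edit['duplicate_targetfname'] == True))
)

# Shorter form: invert with ~ and use the boolean columns directly.
with_invert = (
    ~df_edit['Included'] &
    df_edit['Update Check'] &
    (df_edit['duplicate_fname'] | df_edit['duplicate_targetfname'])
)

same = with_parens.equals(with_invert)   # element-wise identical?
result = with_invert.tolist()
print(same, result)
```

The shorter form assumes the columns really have bool dtype, as they do here. On integer columns ~ is a bitwise inversion rather than a logical not, so the explicit comparisons would be the safer choice in that case.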