Creating a new column depending on the equality of two other columns

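Problem description

I want to compare the values of two columns and create a new column, bin_crnn, from the result: 1 if the two values are equal, 0 otherwise.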

# coding: utf-8
import pandas as pd

df = pd.read_csv('file.csv',sep=',')

if df['crnn_pred']==df['manual_raw_value']:
    df['bin_crnn']=1
else:
    df['bin_crnn']=0

I got the following error:

    if df['crnn_pred']==df['manual_raw_value']:
  File "/home/ahmed/anaconda3/envs/cv/lib/python2.7/site-packages/pandas/core/generic.py", line 917, in __nonzero__
    .format(self.__class__.__name__))
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), a.item(), a.any() or a.all().

Solution

You need to cast the boolean mask to int with astype:

df['bin_crnn'] = (df['crnn_pred']==df['manual_raw_value']).astype(int)

Sample:

df = pd.DataFrame({'crnn_pred':[1,2,5], 'manual_raw_value':[1,8,5]})
print (df)
   crnn_pred  manual_raw_value
0          1                 1
1          2                 8
2          5                 5

print (df['crnn_pred']==df['manual_raw_value'])
0     True
1    False
2     True
dtype: bool

df['bin_crnn'] = (df['crnn_pred']==df['manual_raw_value']).astype(int)
print (df)
   crnn_pred  manual_raw_value  bin_crnn
0          1                 1         1
1          2                 8         0
2          5                 5         1
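
As a side note, the same column can also be built with numpy.where, which is not part of the original answer but may be handy when the two output values are something other than 1 and 0. A minimal sketch, assuming the same sample data as above:

```python
import numpy as np
import pandas as pd

# Same sample data as in the answer above.
df = pd.DataFrame({'crnn_pred': [1, 2, 5], 'manual_raw_value': [1, 8, 5]})

# np.where(condition, value_if_true, value_if_false) works element-wise,
# so each row gets 1 where the two columns match and 0 where they differ.
df['bin_crnn'] = np.where(df['crnn_pred'] == df['manual_raw_value'], 1, 0)

print(df['bin_crnn'].tolist())  # [1, 0, 1]
```

For a plain 1/0 flag the astype(int) version is shorter; numpy.where is mainly useful if you want labels such as strings instead.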

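You get the error because comparing two columns does not produce a scalar; it produces a Series (array) of True and False values, one per row, and an if statement cannot decide what a whole Series means as a single condition.

So if you really want a single scalar True or False, you need all or any.

I think this answer explains it better.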
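
To make the point about all and any concrete, here is a small sketch using the same sample data. The variable names (mask, raised, all_equal, any_equal) are only illustrative:

```python
import pandas as pd

df = pd.DataFrame({'crnn_pred': [1, 2, 5], 'manual_raw_value': [1, 8, 5]})

# Element-wise comparison: the result is a boolean Series, not a single bool.
mask = df['crnn_pred'] == df['manual_raw_value']

# Using the Series where a single bool is expected (as `if` does) raises ValueError.
try:
    bool(mask)
    raised = False
except ValueError:
    raised = True

# all() and any() reduce the Series to one scalar value.
all_equal = bool(mask.all())  # True only if every row matches
any_equal = bool(mask.any())  # True if at least one row matches

print(raised, all_equal, any_equal)  # True False True
```

Note that all and any answer a different question (do all rows match, does any row match) than the one asked here. For a per-row 1/0 column, casting the mask with astype(int) is what you want.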
