How to compare values in pandas between two different columns?

Problem description

My Table:

A           Country     Code1           Code2
626349      US          640AD1237       407223
702747      NaN         IO1062123       407255
824316      US          NaN             NaN
712947      US          00220221        870262123
278147      Canada      721AC31234      109123
278144      Canada      NaN             7214234321
278142      Canada      72142QW134      109123AS12

In the table above, I need to check the country and the codes.

I want a fifth column marked correct or incorrect, following this pseudocode:

If 'Country' == 'US' and (length(Code1) == 9 OR length(Code2) == 9):
    Add values to 5th column as correct.
else:
    Add values to 5th column as incorrect.

If 'Country' == 'Canada' and (length(Code1) == 10 OR length(Code2) == 10):
    Add values to 5th column as correct.
else:
    Add values to 5th column as incorrect.

If there is no value in either the Country column or the Code columns, the row should be marked as insufficient information.

I can't figure out how to do this in pandas. Please help. Thanks.

I first tried to compute the lengths of the values in Code1 and Code2 and store them in a separate DataFrame, but after that I could not compare the two sets of data the way I need to.

Len1 = df.Code1.map(len)
Len2 = df.Code2.map(len)
LengthCode = pd.DataFrame({'Len_Code1': Len1,'Len_Code2': Len2})

Please tell me a better way to do this, in a single DataFrame if possible.
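One possible way to keep the lengths in the same DataFrame is sketched below on a small sample of the table above. It assumes the codes are stored as strings. `Series.str.len()` leaves missing values as NaN, whereas `map(len)` raises a TypeError on them. The column names `Len_Code1` and `Len_Code2` are carried over from the attempt above.

```python
import numpy as np
import pandas as pd

# Small sample of the table above; codes are assumed to be strings
df = pd.DataFrame({
    'Country': ['US', np.nan, 'US', 'Canada'],
    'Code1': ['640AD1237', 'IO1062123', np.nan, '721AC31234'],
    'Code2': ['407223', '407255', np.nan, '109123'],
})

# .str.len() keeps NaN for missing codes instead of raising like map(len)
df['Len_Code1'] = df.Code1.str.len()
df['Len_Code2'] = df.Code2.str.len()
```

The length columns now sit next to the original ones, so they can be compared row by row without a second DataFrame.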

I also tried this:

df[((df.Country == 'US') & ((df.Code1.str.len() == 9) | (df.Code2.str.len() == 9)))
   | ((df.Country == 'Canada') & ((df.Code1.str.len() == 10) | (df.Code2.str.len() == 10)))]

But the expression is getting long, and I won't be able to write it out for many countries.

Solution

This will give you an 'is_correct' boolean column:

# Map each country to the code length it requires.
code_lengths = {'US':9, 'Canada':10}
df['correct_code_length'] = df.Country.replace(code_lengths)
# A row is correct when either code has the required length (str() turns NaN into 'nan').
df['is_correct'] = (df.Code1.apply(lambda x: len(str(x))) == df.correct_code_length) | (df.Code2.apply(lambda x: len(str(x))) == df.correct_code_length)

You will need to populate the code_lengths dictionary with more countries as necessary.
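The question also asks for an "insufficient information" label, which a boolean column cannot express. A minimal sketch of one way to get three labels with `np.select` follows. Two things here are assumptions rather than part of the answer above: the column name `status`, and the rule that a row is insufficient when the country is missing or both codes are missing. The codes are assumed to be stored as strings.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    'A': [626349, 702747, 824316, 712947, 278147, 278144, 278142],
    'Country': ['US', np.nan, 'US', 'US', 'Canada', 'Canada', 'Canada'],
    'Code1': ['640AD1237', 'IO1062123', np.nan, '00220221',
              '721AC31234', np.nan, '72142QW134'],
    'Code2': ['407223', '407255', np.nan, '870262123',
              '109123', '7214234321', '109123AS12'],
})

code_lengths = {'US': 9, 'Canada': 10}

# Required length per row; NaN when the country is missing or not in the dict
required = df.Country.map(code_lengths)

# True when either code has the required length (missing codes never match)
has_match = df.Code1.str.len().eq(required) | df.Code2.str.len().eq(required)

# Assumed rule: no country, or no code at all, means insufficient information
insufficient = df.Country.isna() | (df.Code1.isna() & df.Code2.isna())

# np.select uses the first condition that holds, so "insufficient" takes priority
df['status'] = np.select(
    [insufficient, has_match],
    ['insufficient information', 'correct'],
    default='incorrect',
)
```

With this rule, a country that is present but not listed in `code_lengths` ends up as 'incorrect'. If such rows should count as insufficient instead, `required.isna()` could be used in place of `df.Country.isna()`.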
