Case-insensitive pandas DataFrame.merge

Question

I am struggling with the easiest way to do a case insensitive merge in pandas. Is there a way to do it right on the merge? Do I need to use (?i) or a regex with ignorecase? In my code snippet below I am joining some Countries where it may be "United States" in one file and "UNITED STATES" in another and I just want to take the case out of the equation. Thank you!

import pandas as pd
import csv
import sys

env_path = sys.argv[1]
map_path = sys.argv[2]


df_address = pd.read_csv(env_path + "\\address.csv")
df_CountryMapping = pd.read_csv(map_path + "\\CountryMapping.csv")  # escape the backslash, as on the line above

df_merged = df_address.merge(df_CountryMapping, left_on="Country", right_on="NAME", how="left")

....

Recommended answer

Lowercase the values in the two columns used for the merge, then merge on the lowercased columns:

df_address['country_lower'] = df_address['Country'].str.lower()
df_CountryMapping['name_lower'] = df_CountryMapping['NAME'].str.lower()
df_merged = df_address.merge(df_CountryMapping, left_on="country_lower", right_on="name_lower", how="left")
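The approach above leaves the two helper columns in the merged result. A minimal sketch of one way to wrap it so the temporary key is dropped afterwards is shown here. The function name `merge_case_insensitive`, the key name `_merge_key`, and the sample data (including the `CODE` column) are hypothetical and only for illustration. It uses `str.casefold()` instead of `str.lower()`, which is a swapped-in choice that also normalizes some non-ASCII case variants.

```python
import pandas as pd


def merge_case_insensitive(left, right, left_on, right_on, how="left"):
    """Merge two frames on string columns while ignoring case (hypothetical helper)."""
    key = "_merge_key"  # temporary column; assumed not to exist in either frame
    # assign() returns copies, so the caller's frames are not modified
    left_keyed = left.assign(**{key: left[left_on].str.casefold()})
    right_keyed = right.assign(**{key: right[right_on].str.casefold()})
    # Merge on the case-folded key, then drop it from the result
    return left_keyed.merge(right_keyed, on=key, how=how).drop(columns=key)


# Made-up sample data standing in for address.csv and CountryMapping.csv
df_address = pd.DataFrame({"Country": ["United States", "canada", "France"]})
df_CountryMapping = pd.DataFrame(
    {"NAME": ["UNITED STATES", "Canada"], "CODE": ["US", "CA"]}
)

df_merged = merge_case_insensitive(df_address, df_CountryMapping, "Country", "NAME")
print(df_merged)
```

With `how="left"`, rows from `df_address` that have no match (here "France") are kept with missing values in the mapping columns. If the files may also contain stray whitespace, chaining `.str.strip()` before `.str.casefold()` is a possible extension.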
