Mapping 2 dataframes and replacing header of matched values in target dataframe
Question
I have a dataframe, df1:
SAP_Name SAP_Class SAP_Sec
Avi 5 C
Rison 6 A
Slesh 7 B
San 8 C
Sud 7 B
df2:
Name_Fi Class
Avi 5
Rison 6
Slesh 7
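To reproduce the example, the two frames above can be built roughly like this. It is a minimal sketch: the values are copied from the tables, while the dtypes (strings and plain integers) are an assumption.

```python
import pandas as pd

# Source frame whose headers should be reused (values from the df1 table).
df1 = pd.DataFrame({
    "SAP_Name": ["Avi", "Rison", "Slesh", "San", "Sud"],
    "SAP_Class": [5, 6, 7, 8, 7],
    "SAP_Sec": ["C", "A", "B", "C", "B"],
})

# Target frame whose headers should be replaced (values from the df2 table).
df2 = pd.DataFrame({
    "Name_Fi": ["Avi", "Rison", "Slesh"],
    "Class": [5, 6, 7],
})
```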
I am trying to match df2 against df1 so that every df2 column whose values all appear in a df1 column takes that df1 column's header. Expected output:
SAP_Name SAP_Class
Avi 5
Rison 6
Slesh 7
Below is the code I am using:
d = {}
for col2 in df2.columns:
    for col1 in df1.columns:
        cond = df2[col2].isin(df1[col1]).all()
        if cond:
            d[col2] = col1
df2 = df2.rename(columns=d)
print(df2)
This gives the desired output on a small file. However, my actual file has 112444 rows × 446 columns and the target file to be changed has 3 rows × 35 columns, and in that case the code runs for a very long time. Can anyone please help me here?
Recommended answer
In my opinion, if performance is important, use issubset with set:
d = {}
for col2 in df2.columns:
    for col1 in df1.columns:
        cond = set(df2[col2]).issubset(df1[col1])
        if cond:
            d[col2] = col1
df2 = df2.rename(columns=d)
print(df2)
SAP_Name SAP_Class
0 Avi 5
1 Rison 6
2 Slesh 7
Edit:
# create a dictionary of Series without duplicates
dfs1 = {col1: df1[col1].drop_duplicates() for col1 in df1.columns}
# print(dfs1)
# create a dictionary of sets
set2 = {col2: set(df2[col2]) for col2 in df2.columns}
# print(set2)
# loop over both dictionaries and find the columns to rename
d = {}
for col2, v2 in set2.items():
    for col1, v1 in dfs1.items():
        cond = v2.issubset(v1)
        if cond:
            d[col2] = col1
df2 = df2.rename(columns=d)
print(df2)
SAP_Name SAP_Class
0 Avi 5
1 Rison 6
2 Slesh 7
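A possible further variation, not part of the answer above: when `issubset` is given something that is not already a set, CPython first turns it into a set, and that happens again on every call. Building one set per df1 column up front avoids repeating the work for every column pair. The sketch below assumes all values are hashable; `match_headers` is a hypothetical helper name, and the sample frames are rebuilt from the tables in the question. Unlike the loops above, which keep the last matching df1 column, it stops at the first match.

```python
import pandas as pd


def match_headers(source: pd.DataFrame, target: pd.DataFrame) -> pd.DataFrame:
    """Rename each target column to the first source column that contains all its values."""
    # Build one set per source column, once, instead of once per column pair.
    source_sets = {col: set(source[col]) for col in source.columns}

    mapping = {}
    for t_col in target.columns:
        t_values = set(target[t_col])
        for s_col, s_values in source_sets.items():
            if t_values <= s_values:  # subset test between two sets
                mapping[t_col] = s_col
                break  # first match wins; later source columns are skipped
    return target.rename(columns=mapping)


# Sample data copied from the question.
df1 = pd.DataFrame({
    "SAP_Name": ["Avi", "Rison", "Slesh", "San", "Sud"],
    "SAP_Class": [5, 6, 7, 8, 7],
    "SAP_Sec": ["C", "A", "B", "C", "B"],
})
df2 = pd.DataFrame({"Name_Fi": ["Avi", "Rison", "Slesh"], "Class": [5, 6, 7]})

renamed = match_headers(df1, df2)
print(renamed.columns.tolist())  # ['SAP_Name', 'SAP_Class']
```

The trade-off is memory: one set is held for every df1 column at the same time, which should be weighed against the time saved on a wide frame.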