Match one table and map values to another in pandas (Python)
Question
I have two pandas DataFrames. df1:
LT route_1 c2
PM/2 120 44
PM/52 110 49
PM/522 103 51
PM/522 103 51
PM/24 105 48
PM/536 109 67
PM/536 109 67
PM/5356 112 144
df2:
LT W_ID
PM/2 120.0
PM/52 110.0
PM/522 103.0
PM/522 103.0
PM/24 105.0
PM/536 109.0
PM/536 109.0
PM/5356 112.0
I need to map W_ID from df2 onto route_1 in df1 (to be clear: replace the existing values), with rows matched on LT, so that LT in one table lines up with LT in the other. Desired output:
LT route_1 c2
PM/2 120.0 44
PM/52 110.0 49
PM/522 103.0 51
PM/522 103.0 51
PM/24 105.0 48
PM/536 109.0 67
PM/536 109.0 67
PM/5356 112.0 144
Recommended answer
I think map should work:
df1['route_1'] = df1['LT'].map(df2.set_index('LT')['W_ID'])
Unfortunately, it raises:
InvalidIndexError: Reindexing only valid with uniquely valued Index objects
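The problem is the duplicates in the LT column: after set_index('LT') the lookup index is not unique, so map cannot tell which W_ID to return. The solution is to add a helper column with cumcount, which numbers the repeated occurrences of each LT value, so that the left join done by merge runs on a unique key pair: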
df1['g'] = df1.groupby('LT').cumcount()
df2['g'] = df2.groupby('LT').cumcount()
df = pd.merge(df1, df2, on=['LT','g'], how='left')
print (df)
LT route_1 c2 g W_ID
0 PM/2 120 44 0 120.0
1 PM/52 110 49 0 110.0
2 PM/522 103 51 0 103.0
3 PM/522 103 51 1 103.0
4 PM/24 105 48 0 105.0
5 PM/536 109 67 0 109.0
6 PM/536 109 67 1 109.0
7 PM/5356 112 144 0 112.0
df1['route_1'] = df['W_ID']
df1.drop('g', axis=1, inplace=True)
print (df1)
LT route_1 c2
0 PM/2 120.0 44
1 PM/52 110.0 49
2 PM/522 103.0 51
3 PM/522 103.0 51
4 PM/24 105.0 48
5 PM/536 109.0 67
6 PM/536 109.0 67
7 PM/5356 112.0 144
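For reference, the same steps can be put into one self-contained sketch that rebuilds the sample frames from the question. The helper name replace_by_occurrence and the temporary column _g are made up for this illustration and are not part of pandas. It assumes the source column name does not already exist in the left frame, and it returns a new frame instead of modifying df1 in place.

```python
import pandas as pd

# Sample data copied from the question.
lt = ["PM/2", "PM/52", "PM/522", "PM/522", "PM/24", "PM/536", "PM/536", "PM/5356"]
df1 = pd.DataFrame({
    "LT": lt,
    "route_1": [120, 110, 103, 103, 105, 109, 109, 112],
    "c2": [44, 49, 51, 51, 48, 67, 67, 144],
})
df2 = pd.DataFrame({
    "LT": lt,
    "W_ID": [120.0, 110.0, 103.0, 103.0, 105.0, 109.0, 109.0, 112.0],
})


def replace_by_occurrence(left, right, key, target, source):
    """Hypothetical helper: copy right[source] into left[target],
    matching rows on `key` plus the occurrence number of each key."""
    # cumcount numbers repeated keys 0, 1, 2, ... so (key, _g) is unique.
    l = left.assign(_g=left.groupby(key).cumcount())
    r = right.assign(_g=right.groupby(key).cumcount())[[key, "_g", source]]
    # A left join on a unique key pair keeps the row order and row count of `left`.
    merged = l.merge(r, on=[key, "_g"], how="left")
    merged.index = left.index  # restore the original index for alignment
    out = left.copy()
    out[target] = merged[source]
    return out


result = replace_by_occurrence(df1, df2, "LT", "route_1", "W_ID")
print(result)
```

Because the helper works on copies, neither df1 nor df2 is left with the temporary counter column, so no drop call is needed afterwards.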
A similar solution:
df1['g'] = df1.groupby('LT').cumcount()
df2['g'] = df2.groupby('LT').cumcount()
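# Parentheses let the method chain span several lines.
df = (pd.merge(df1, df2, on=['LT','g'], how='left')
        .drop(['g', 'route_1'], axis=1)
        .rename(columns={'W_ID':'route_1'})
        # reindex_axis was removed in pandas 1.0; reindex(columns=...) reorders the columns.
        .reindex(columns=['LT', 'route_1', 'c2']))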
print (df)
LT route_1 c2
0 PM/2 120.0 44
1 PM/52 110.0 49
2 PM/522 103.0 51
3 PM/522 103.0 51
4 PM/24 105.0 48
5 PM/536 109.0 67
6 PM/536 109.0 67
7 PM/5356 112.0 144
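If every repeated LT in df2 always carries the same W_ID, as it does in the sample data, the original map idea can still be used once the lookup is made unique. This is a different technique from the cumcount merge above, not a drop-in replacement: it assumes one W_ID per LT, and the sketch checks that assumption before mapping. The variable name lookup is chosen only for this illustration.

```python
import pandas as pd

# Sample data copied from the question.
lt = ["PM/2", "PM/52", "PM/522", "PM/522", "PM/24", "PM/536", "PM/536", "PM/5356"]
df1 = pd.DataFrame({
    "LT": lt,
    "route_1": [120, 110, 103, 103, 105, 109, 109, 112],
    "c2": [44, 49, 51, 51, 48, 67, 67, 144],
})
df2 = pd.DataFrame({
    "LT": lt,
    "W_ID": [120.0, 110.0, 103.0, 103.0, 105.0, 109.0, 109.0, 112.0],
})

# Assumption check: each LT key maps to at most one distinct W_ID.
if not (df2.groupby("LT")["W_ID"].nunique() <= 1).all():
    raise ValueError("some LT keys have conflicting W_ID values; use the cumcount merge instead")

# Keep one row per LT so the lookup index is unique, then map.
lookup = df2.drop_duplicates("LT").set_index("LT")["W_ID"]
df1["route_1"] = df1["LT"].map(lookup)
print(df1)
```

When the same LT can legitimately have different W_ID values per occurrence, this shortcut would silently pick the first one, which is why the check is there and why the cumcount approach is the safer general answer.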