Match one table and map values to another in pandas (Python)

Problem description

I have two pandas DataFrames. df1:

LT     route_1 c2
PM/2     120   44
PM/52    110   49
PM/522   103   51
PM/522   103   51
PM/24    105   48
PM/536   109   67
PM/536   109   67
PM/5356  112   144 

df2:

LT       W_ID 
PM/2     120.0
PM/52    110.0
PM/522   103.0
PM/522   103.0
PM/24    105.0
PM/536   109.0
PM/536   109.0
PM/5356  112.0

I need to replace route_1 in df1 with W_ID from df2, matching rows by LT: the LT in one table has to match the LT in the other. Desired output:

LT     route_1   c2
PM/2     120.0   44
PM/52    110.0   49
PM/522   103.0   51
PM/522   103.0   51
PM/24    105.0   48
PM/536   109.0   67
PM/536   109.0   67
PM/5356  112.0   144 

Recommended answer

I think map should work:

df1['route_1'] = df1['LT'].map(df2.set_index('LT')['W_ID'])

Unfortunately, it raises:

InvalidIndexError: Reindexing only valid with uniquely valued Index objects
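To see where the error comes from, one can check whether the lookup Series built from df2 has a unique index. The following is a minimal sketch that rebuilds df2 from the sample data in the question; the names `lookup` and `dup_labels` are introduced here for illustration only.

```python
import pandas as pd

# df2 rebuilt from the sample data in the question
df2 = pd.DataFrame({
    "LT": ["PM/2", "PM/52", "PM/522", "PM/522",
           "PM/24", "PM/536", "PM/536", "PM/5356"],
    "W_ID": [120.0, 110.0, 103.0, 103.0, 105.0, 109.0, 109.0, 112.0],
})

# This is the Series that map() would use as its lookup table
lookup = df2.set_index("LT")["W_ID"]

# map() needs each label to appear once in the lookup index
print(lookup.index.is_unique)  # False

# Labels that occur more than once
dup_labels = lookup.index[lookup.index.duplicated()].unique().tolist()
print(dup_labels)  # ['PM/522', 'PM/536']
```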

The problem is the duplicates in the LT column. One solution is to add a helper column with cumcount, so that LT together with the helper column forms a unique key for a left join with merge:

import pandas as pd

# g numbers the repeats of each LT value (0, 1, ...), so (LT, g) is unique
df1['g'] = df1.groupby('LT').cumcount()
df2['g'] = df2.groupby('LT').cumcount()
df = pd.merge(df1, df2, on=['LT','g'], how='left')
print (df)
        LT  route_1   c2  g   W_ID
0     PM/2      120   44  0  120.0
1    PM/52      110   49  0  110.0
2   PM/522      103   51  0  103.0
3   PM/522      103   51  1  103.0
4    PM/24      105   48  0  105.0
5   PM/536      109   67  0  109.0
6   PM/536      109   67  1  109.0
7  PM/5356      112  144  0  112.0

df1['route_1'] = df['W_ID']
df1.drop('g', axis=1, inplace=True)
print (df1)
        LT  route_1   c2
0     PM/2    120.0   44
1    PM/52    110.0   49
2   PM/522    103.0   51
3   PM/522    103.0   51
4    PM/24    105.0   48
5   PM/536    109.0   67
6   PM/536    109.0   67
7  PM/5356    112.0  144
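For reference, the same approach can be written as a self-contained script. This is only a sketch built on the sample data from the question; it uses `assign` so that the helper column never has to be added to (and dropped from) the original frames, and the names `left`, `right`, `merged` and `result` are illustrative, not part of the original answer.

```python
import pandas as pd

lt = ["PM/2", "PM/52", "PM/522", "PM/522",
      "PM/24", "PM/536", "PM/536", "PM/5356"]
df1 = pd.DataFrame({
    "LT": lt,
    "route_1": [120, 110, 103, 103, 105, 109, 109, 112],
    "c2": [44, 49, 51, 51, 48, 67, 67, 144],
})
df2 = pd.DataFrame({
    "LT": lt,
    "W_ID": [120.0, 110.0, 103.0, 103.0, 105.0, 109.0, 109.0, 112.0],
})

# cumcount numbers the repeats of each LT value: 0 for the first, 1 for the second, ...
left = df1.assign(g=df1.groupby("LT").cumcount())
right = df2.assign(g=df2.groupby("LT").cumcount())

# (LT, g) is unique on both sides, so the left join keeps one row per df1 row
merged = left.merge(right, on=["LT", "g"], how="left")

# A left merge preserves the row order of the left frame,
# so the values can be copied over positionally
result = df1.copy()
result["route_1"] = merged["W_ID"].to_numpy()
print(result)
```

Copying the values with `to_numpy()` avoids relying on index alignment, which matters if df1 does not have a default 0..n-1 index.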

A similar solution:

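df1['g'] = df1.groupby('LT').cumcount()
df2['g'] = df2.groupby('LT').cumcount()
# Parentheses let the method chain span several lines
df = (pd.merge(df1, df2, on=['LT','g'], how='left')
        .drop(['g', 'route_1'], axis=1)
        .rename(columns={'W_ID':'route_1'})
        # reindex_axis was removed in pandas 1.0; reindex(columns=...) replaces it
        .reindex(columns=['LT', 'route_1', 'c2']))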
print (df)
        LT  route_1   c2
0     PM/2    120.0   44
1    PM/52    110.0   49
2   PM/522    103.0   51
3   PM/522    103.0   51
4    PM/24    105.0   48
5   PM/536    109.0   67
6   PM/536    109.0   67
7  PM/5356    112.0  144
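If every repeated LT in df2 carries the same W_ID, as happens to be the case in the sample data, the original map idea can also be made to work by dropping the duplicate rows first. This is an alternative sketch under that assumption, not part of the original answer; when the same LT can have different W_ID values, the cumcount approach above is the safer choice.

```python
import pandas as pd

lt = ["PM/2", "PM/52", "PM/522", "PM/522",
      "PM/24", "PM/536", "PM/536", "PM/5356"]
df1 = pd.DataFrame({
    "LT": lt,
    "route_1": [120, 110, 103, 103, 105, 109, 109, 112],
    "c2": [44, 49, 51, 51, 48, 67, 67, 144],
})
df2 = pd.DataFrame({
    "LT": lt,
    "W_ID": [120.0, 110.0, 103.0, 103.0, 105.0, 109.0, 109.0, 112.0],
})

# Assumption: each LT maps to exactly one W_ID; fail loudly otherwise
assert df2.groupby("LT")["W_ID"].nunique().max() == 1

# Keep the first row per LT so the lookup index is unique, then map
lookup = df2.drop_duplicates("LT").set_index("LT")["W_ID"]
result = df1.assign(route_1=df1["LT"].map(lookup))
print(result)
```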
