Pandas join issue: columns overlap but no suffix specified

Question

I have the following data frames:

print(df_a)
     mukey  DI  PI
0   100000  35  14
1  1000005  44  14
2  1000006  44  14
3  1000007  43  13
4  1000008  43  13

print(df_b)
    mukey  niccdcd
0  190236        4
1  190237        6
2  190238        7
3  190239        4
4  190240        7

When I try to join these data frames:

join_df = df_a.join(df_b, on='mukey', how='left')

I get the error:

*** ValueError: columns overlap but no suffix specified: Index([u'mukey'], dtype='object')

Why is this so? The data frames do have common 'mukey' values.

Recommended answer

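The error on the snippet of data you posted is a little cryptic. It is about column names, not values: join matches the 'mukey' column of df_a against the index of df_b, so df_b's own 'mukey' column is carried into the result as an ordinary column and collides with df_a's 'mukey'. To resolve the name clash, join requires you to supply a suffix for the left and right hand side. The result is still all NaN, because df_a's 'mukey' values are being compared with df_b's default index (0 to 4):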

In [173]:

df_a.join(df_b, on='mukey', how='left', lsuffix='_left', rsuffix='_right')
Out[173]:
       mukey_left  DI  PI  mukey_right  niccdcd
index                                          
0          100000  35  14          NaN      NaN
1         1000005  44  14          NaN      NaN
2         1000006  44  14          NaN      NaN
3         1000007  43  13          NaN      NaN
4         1000008  43  13          NaN      NaN
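
If the goal is to keep using join, one option is to move 'mukey' into df_b's index first, so that join has something meaningful to match against and no column name is duplicated. The following is a minimal sketch, not the original poster's code; the data is hypothetical (same column names as the question, but with two shared 'mukey' values invented for illustration):

```python
import pandas as pd

# Hypothetical data: column names follow the question, values are made up
# so that two mukey values (190236 and 190238) exist in both frames.
df_a = pd.DataFrame({"mukey": [100000, 190236, 190238],
                     "DI": [35, 44, 43],
                     "PI": [14, 14, 13]})
df_b = pd.DataFrame({"mukey": [190236, 190237, 190238],
                     "niccdcd": [4, 6, 7]})

# join matches df_a['mukey'] against the INDEX of the other frame,
# so make mukey the index of df_b before joining.
join_df = df_a.join(df_b.set_index("mukey"), on="mukey", how="left")
print(join_df)
```

Because 'mukey' is no longer a regular column in the right-hand frame, the column names do not overlap and no suffix is needed.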

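merge works because it doesn't have this restriction: it matches the 'mukey' column on both sides, so the key appears only once in the result. The NaN values are expected here, since the posted snippets share no 'mukey' values: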

In [176]:

df_a.merge(df_b, on='mukey', how='left')
Out[176]:
     mukey  DI  PI  niccdcd
0   100000  35  14      NaN
1  1000005  44  14      NaN
2  1000006  44  14      NaN
3  1000007  43  13      NaN
4  1000008  43  13      NaN
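
To check whether the frames really do share 'mukey' values, merge accepts indicator=True, which adds a '_merge' column saying where each row came from. This is a small sketch on hypothetical data (column names from the question, values invented so that two keys match), not output from the original data:

```python
import pandas as pd

# Hypothetical data: 190236 and 190238 appear in both frames.
df_a = pd.DataFrame({"mukey": [100000, 190236, 190238],
                     "DI": [35, 44, 43],
                     "PI": [14, 14, 13]})
df_b = pd.DataFrame({"mukey": [190236, 190237, 190238],
                     "niccdcd": [4, 6, 7]})

# Left merge on the shared column; '_merge' is 'both' for matched rows
# and 'left_only' for rows of df_a with no partner in df_b.
merged = df_a.merge(df_b, on="mukey", how="left", indicator=True)
print(merged)
```

Rows marked 'left_only' are the ones that end up with NaN in 'niccdcd'.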
