Pandas join issue: columns overlap but no suffix specified
Question
I have the following 2 dataframes:
df_a =
     mukey  DI  PI
0   100000  35  14
1  1000005  44  14
2  1000006  44  14
3  1000007  43  13
4  1000008  43  13
df_b =
    mukey  niccdcd
0  190236        4
1  190237        6
2  190238        7
3  190239        4
4  190240        7
When I try to join these 2 dataframes:
join_df = df_a.join(df_b,on='mukey',how='left')
I get the error:
*** ValueError: columns overlap but no suffix specified: Index([u'mukey'], dtype='object')
Why is this so? The dataframes do have common 'mukey' values.
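Recommended answer
The error message is a little cryptic. join matches the on column of the left dataframe against the index of the right dataframe, not against its mukey column, so both mukey columns would end up in the result. Because the column names overlap, join requires you to supply a suffix for the left and right hand side. Also note that in the snippet of data you posted there are no matching values, which is why the right-hand columns come back as NaN: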
In [173]:
df_a.join(df_b, on='mukey', how='left', lsuffix='_left', rsuffix='_right')
Out[173]:
       mukey_left  DI  PI  mukey_right  niccdcd
index
0          100000  35  14          NaN      NaN
1         1000005  44  14          NaN      NaN
2         1000006  44  14          NaN      NaN
3         1000007  43  13          NaN      NaN
4         1000008  43  13          NaN      NaN
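If the goal is to match on mukey values with join, one option is to move mukey into the index of the right dataframe first, so that no column names overlap. The following is only a minimal sketch under an assumption: the second mukey in df_b has been changed to 1000005 (a value that is not in the posted data) so that at least one row matches.

```python
import pandas as pd

df_a = pd.DataFrame({
    "mukey": [100000, 1000005, 1000006, 1000007, 1000008],
    "DI": [35, 44, 44, 43, 43],
    "PI": [14, 14, 14, 13, 13],
})

# Assumption for illustration: 1000005 replaces 190237 so one key matches.
df_b = pd.DataFrame({
    "mukey": [190236, 1000005, 190238, 190239, 190240],
    "niccdcd": [4, 6, 7, 4, 7],
})

# The plain join raises because both frames carry a 'mukey' column.
try:
    df_a.join(df_b, on="mukey", how="left")
    join_error = None
except ValueError as exc:
    join_error = str(exc)
print(join_error)

# With mukey as df_b's index, join looks up df_a['mukey'] in that index
# and there is no overlapping column left to rename.
joined = df_a.join(df_b.set_index("mukey"), on="mukey", how="left")
print(joined)
```

With this arrangement the result keeps a single mukey column, and only the row whose key exists in both frames gets a niccdcd value.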
merge works because it joins on the mukey column of both dataframes and keeps a single key column, so it doesn't have this restriction:
In [176]:
df_a.merge(df_b, on='mukey', how='left')
Out[176]:
     mukey  DI  PI  niccdcd
0   100000  35  14      NaN
1  1000005  44  14      NaN
2  1000006  44  14      NaN
3  1000007  43  13      NaN
4  1000008  43  13      NaN
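Every niccdcd value is NaN here because none of the posted mukey values appear in both dataframes. A quick way to check this, sketched below with the data from the question, is merge's indicator parameter, which adds a _merge column saying where each row came from. The variable names merged and common are just illustrative.

```python
import pandas as pd

df_a = pd.DataFrame({
    "mukey": [100000, 1000005, 1000006, 1000007, 1000008],
    "DI": [35, 44, 44, 43, 43],
    "PI": [14, 14, 14, 13, 13],
})
df_b = pd.DataFrame({
    "mukey": [190236, 190237, 190238, 190239, 190240],
    "niccdcd": [4, 6, 7, 4, 7],
})

# indicator=True adds a '_merge' column: 'left_only', 'right_only' or 'both'.
merged = df_a.merge(df_b, on="mukey", how="left", indicator=True)
print(merged["_merge"].value_counts())

# Direct check of which keys the two frames share.
common = set(df_a["mukey"]) & set(df_b["mukey"])
print(common)
```

If the full dataframes really do share mukey values, the same check on them should show rows marked both.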