Merge dataframes without duplicating rows in python pandas

Question

I'd like to combine two dataframes using their shared column 'A':

>>> df1
    A   B
0   I   1
1   I   2
2   II  3

>>> df2
    A   C
0   I   4
1   II  5
2   III 6

To do this, I tried:

merged = pd.merge(df1, df2, on='A', how='outer')

which returns:

>>> merged
    A   B   C
0   I   1.0 4
1   I   2.0 4
2   II  3.0 5
3   III NaN 6

However, since df2 contains only one row with A == 'I', I do not want its value to be repeated in the merged dataframe. Instead, I would like the following output:

>>> merged
    A   B   C
0   I   1.0 4
1   I   2.0 NaN
2   II  3.0 5
3   III NaN 6

What is the best way to do this? I am new to python and still slightly confused by all the join/merge/concatenate/append operations.

Recommended answer

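Let's do it with cumcount: number the repeated keys within each dataframe using groupby('A').cumcount(), then merge on both 'A' and that counter, so each row of df2 is matched at most once: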

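# 'g' counts occurrences of each key: 0 for the first 'I', 1 for the second, ...
df1['g'] = df1.groupby('A').cumcount()
df2['g'] = df2.groupby('A').cumcount()
# merge on the key plus its occurrence number, then drop the helper column
# (drop('g', 1) with a positional axis no longer works in pandas 2.x)
df1.merge(df2, on=['A', 'g'], how='outer').drop(columns='g')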
Out[62]: 
     A    B    C
0    I  1.0  4.0
1    I  2.0  NaN
2   II  3.0  5.0
3  III  NaN  6.0
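
For anyone who wants to run this end to end, here is a minimal, self-contained sketch of the same cumcount approach. The helper name merge_without_repeats and the temporary column name "_g" are made up for illustration and are not part of pandas; the sketch also uses assign so that df1 and df2 are left unmodified, which differs slightly from the answer above.

```python
import pandas as pd

# The two frames from the question.
df1 = pd.DataFrame({"A": ["I", "I", "II"], "B": [1, 2, 3]})
df2 = pd.DataFrame({"A": ["I", "II", "III"], "C": [4, 5, 6]})


def merge_without_repeats(left, right, key):
    """Outer-merge on `key`, pairing the n-th occurrence of a key on the
    left only with the n-th occurrence on the right.

    Hypothetical helper for illustration; "_g" is an assumed scratch
    column name that must not already exist in either frame.
    """
    # cumcount numbers the rows within each key group: 0, 1, 2, ...
    left = left.assign(_g=left.groupby(key).cumcount())
    right = right.assign(_g=right.groupby(key).cumcount())
    # (key, _g) is unique within each frame, so no row is matched twice.
    return left.merge(right, on=[key, "_g"], how="outer").drop(columns="_g")


merged = merge_without_repeats(df1, df2, "A")
print(merged)
```

With the frames from the question, this should give the same four rows as the output above: the second 'I' row has g == 1, which has no partner in df2, so its C value is NaN instead of a repeated 4.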
