Merge two different dataframes on different column names

Question

I have two dataframes:

import pandas as pd

df1 = pd.DataFrame({'A': ['A1', 'A1', 'A2', 'A3'],
                    'B': ['121', '345', '123', '146'],
                    'C': ['K0', 'K1', 'K0', 'K1']})

df2 = pd.DataFrame({'A': ['A1', 'A3'],
                    'BB': ['B0', 'B3'],
                    'CC': ['121', '345'],
                    'DD': ['D0', 'D1']})

Now I need to get the rows where columns A and B of df1 match columns A and CC of df2. So I tried the possible merge options, such as:

both_DFS=pd.merge(df1,df2, how='left',left_on=['A','B'],right_on=['A','CC'])

This does not give me the row information from df2, which is what I need. That is, all the column names from df2 are present, but their rows are just empty or NaN.

Then I tried:

Both_DFs=pd.merge(df1,df2, how='left',left_on=['A','B'],right_on=['A','CC'])[['A','B','CC']]

which gave me the error:

KeyError: "['B'] not in index"

I am aiming for a merged dataframe with all columns from both df1 and df2. Any suggestions would be great.

Desired output:

 Both_DFs
    A   B   C   BB  CC  DD
0   A1  121 K0  B0  121 D0

So in my dataframes (df1 and df2), only one row is an exact match on both columns of interest. That is, only one row of columns A and B in df1 exactly matches a row of columns A and CC in df2.

Recommended answer

Well, if you set column A as the index, it works:

# Join on the index (column A) only; recent pandas versions raise a MergeError
# when left_on/right_on are combined with left_index/right_index.
Both_DFs = pd.merge(df1.set_index('A'), df2.set_index('A'), how='left',
                    left_index=True, right_index=True).dropna().reset_index()

The result is:

    A    B   C  BB   CC  DD
0  A1  121  K0  B0  121  D0
1  A1  345  K1  B0  121  D0
2  A3  146  K1  B3  345  D1

Edit

All you need is:

Both_DFs = pd.merge(df1,df2, how='left',left_on=['A','B'],right_on=['A','CC']).dropna()

which gives:

    A    B   C  BB   CC  DD
0  A1  121  K0  B0  121  D0
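
As a possible alternative to the left merge followed by dropna(), an inner merge keeps only the rows whose key pairs match in both dataframes. The sketch below reuses the df1 and df2 from the question; the name both_dfs is only illustrative. One reason to consider it: dropna() drops every row that contains a NaN in any column, so it would also discard matched rows that already had missing values before the merge, whereas how='inner' filters on the keys alone.

```python
import pandas as pd

df1 = pd.DataFrame({'A': ['A1', 'A1', 'A2', 'A3'],
                    'B': ['121', '345', '123', '146'],
                    'C': ['K0', 'K1', 'K0', 'K1']})

df2 = pd.DataFrame({'A': ['A1', 'A3'],
                    'BB': ['B0', 'B3'],
                    'CC': ['121', '345'],
                    'DD': ['D0', 'D1']})

# how='inner' keeps only rows where df1's (A, B) pair equals
# an (A, CC) pair in df2; unmatched rows never enter the result.
both_dfs = pd.merge(df1, df2, how='inner',
                    left_on=['A', 'B'], right_on=['A', 'CC'])

print(both_dfs)
```

With the sample data this should leave the single row shown above (A1, 121, K0, B0, 121, D0). If real data still produces no matches, it may be worth checking that the key columns have the same dtype on both sides, for example strings on one side and integers on the other.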
