Merge two Python pandas data frames of different length but keep all rows in the output data frame


Question

I have two pandas data frames of different lengths. Some of their rows and columns share common values and some do not, like this:

df1:                                 df2:

      Column1  Column2  Column3           ColumnA  ColumnB ColumnC
    0    a        x        x            0    c        y       y
    1    c        x        x            1    e        z       z
    2    e        x        x            2    a        s       s
    3    d        x        x            3    d        f       f
    4    h        x        x
    5    k        x        x            

I want to merge the two data frames so that wherever ColumnA and Column1 have the same value, the row from df2 is appended to the corresponding row in df1, like this:

df1:
    Column1  Column2  Column3  ColumnB  ColumnC
  0    a        x        x        s        s
  1    c        x        x        y        y
  2    e        x        x        z        z
  3    d        x        x        f        f
  4    h        x        x        NaN      NaN
  5    k        x        x        NaN      NaN

I know that a merge can be done with:

df1.merge(df2,left_on='Column1', right_on='ColumnA')

However, this command drops every row whose value in Column1 has no match in ColumnA, and vice versa. Instead, I want to keep those rows from df1 and assign NaN to them in the columns that the other rows fill with values from df2, as shown above. Is there a clean way to do this in pandas?

Thanks in advance!

Recommended answer

You can read the documentation here: http://pandas.pydata.org/pandas-docs/stable/generated/pandas.DataFrame.merge.html

What you are looking for is a left join. By default, merge performs an inner join, which keeps only the rows whose keys appear in both data frames. You can change this behavior by passing a different how argument:

df1.merge(df2,how='left', left_on='Column1', right_on='ColumnA')
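
To check the difference between the two join types, here is a minimal sketch that rebuilds the example frames from the question and runs both merges. The variable names inner, left, and result are only illustrative, and dropping ColumnA at the end is an assumption about the desired output (the table in the question does not show that column), not part of the original answer.

```python
import pandas as pd

# Rebuild the two frames from the question.
df1 = pd.DataFrame({
    "Column1": ["a", "c", "e", "d", "h", "k"],
    "Column2": ["x"] * 6,
    "Column3": ["x"] * 6,
})
df2 = pd.DataFrame({
    "ColumnA": ["c", "e", "a", "d"],
    "ColumnB": ["y", "z", "s", "f"],
    "ColumnC": ["y", "z", "s", "f"],
})

# Default merge is an inner join: the rows for 'h' and 'k' have no
# match in df2, so they are dropped.
inner = df1.merge(df2, left_on="Column1", right_on="ColumnA")

# A left join keeps every row of df1; rows without a match get NaN
# in the columns that come from df2.
left = df1.merge(df2, how="left", left_on="Column1", right_on="ColumnA")

# Assumption: ColumnA only repeats the key already held in Column1,
# so it is removed to match the layout shown in the question.
result = left.drop(columns="ColumnA")

print(len(inner), len(result))
print(result)
```

The left join keeps the row order of df1, so the result lines up with the original frame row by row.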
