Replace data from one pandas dataframe to another

Question

I have two dataframes, df1 and df2. Both contain time-series data, so some of the dates in df1 and df2 may overlap while others do not. I need an operation on the two dataframes that replaces the values in df1 with the values in df2 for dates present in both, leaves the df1 values untouched for index labels not present in df2, and adds the rows for index labels that are present in df2 but not in df1. Consider the following example:

df1:
    A   B   C   D
0   A0  B0  C0  D0
1   A1  B1  C1  D1
2   A2  B2  C2  D2
3   A3  B3  C3  D3

df2:
    A   B   C   E
1   A4  B4  C4  E4
2   A5  B5  C5  E5
3   A6  B6  C6  E6
4   A7  B7  C7  E7

result df:
    A   B   C   D   E
0   A0  B0  C0  D0  NaN
1   A4  B4  C4  D1  E4
2   A5  B5  C5  D2  E5
3   A6  B6  C6  D3  E6
4   A7  B7  C7  NaN E7

I tried to build the logic by first concatenating the two dataframes, but that produces rows with duplicate index labels, and I am not sure how to handle them. How can this be achieved? Any suggestions would help.

A simpler case is when the column names are the same in both dataframes. So suppose df2 has a column D instead of E, with values D4, D5, D6, D7.

Concatenating along the columns then yields:

pd.concat([df1, df2], axis=1)
    A    B    C    D    A    B    C    D
0   A0   B0   C0   D0  NaN  NaN  NaN  NaN  
1   A1   B1   C1   D1   A4   B4   C4   D4
2   A2   B2   C2   D2   A5   B5   C5   D5
3   A3   B3   C3   D3   A6   B6   C6   D6
4  NaN  NaN  NaN  NaN   A7   B7   C7   D7

This introduces duplicate columns. A conventional solution would be to loop through each column, but I am looking for something more elegant. Any ideas would be appreciated.

Recommended answer

update aligns the two DataFrames on their index (and columns) and overwrites the matching values in df1 in place:

df1.update(df2)

df1:
    A   B   C   D
0   A0  B0  C0  D0
1   A1  B1  C1  D1
2   A2  B2  C2  D2
3   A3  B3  C3  D3

df2:
    A   B   C   D
1   A4  B4  C4  D4
2   A5  B5  C5  D5
3   A6  B6  C6  D6
4   A7  B7  C7  D7

>>> df1.update(df2)
>>> df1
    A   B   C   D
0  A0  B0  C0  D0
1  A4  B4  C4  D4
2  A5  B5  C5  D5
3  A6  B6  C6  D6
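
A minimal runnable sketch of this step, rebuilding the same-column example frames from above (the name `ret` is only for illustration). It shows two details that are easy to miss: `update` modifies `df1` in place and returns `None`, and it only touches index labels that already exist in `df1`, so row 4 of `df2` is not added. Note also that `update` skips NaN values in `df2` rather than writing them into `df1`.

```python
import pandas as pd

# Same-column example frames from the question.
df1 = pd.DataFrame(
    {"A": ["A0", "A1", "A2", "A3"],
     "B": ["B0", "B1", "B2", "B3"],
     "C": ["C0", "C1", "C2", "C3"],
     "D": ["D0", "D1", "D2", "D3"]},
    index=[0, 1, 2, 3],
)
df2 = pd.DataFrame(
    {"A": ["A4", "A5", "A6", "A7"],
     "B": ["B4", "B5", "B6", "B7"],
     "C": ["C4", "C5", "C6", "C7"],
     "D": ["D4", "D5", "D6", "D7"]},
    index=[1, 2, 3, 4],
)

ret = df1.update(df2)  # aligns on index and columns, modifies df1 in place
print(ret)             # update returns None, not a new DataFrame
print(df1)             # rows 1-3 hold df2's values; row 4 was not added
```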

You then need to add the rows of df2 whose index labels are not present in df1:

>>> # DataFrame.append was removed in pandas 2.0; pd.concat does the same job
>>> pd.concat([df1, df2.loc[~df2.index.isin(df1.index)]])
    A   B   C   D
0  A0  B0  C0  D0
1  A4  B4  C4  D4
2  A5  B5  C5  D5
3  A6  B6  C6  D6
4  A7  B7  C7  D7
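
As an alternative that is not part of the answer above, `DataFrame.combine_first` may cover both steps in a single call, and it also handles the original case where df2 has a column E instead of D. The sketch below assumes the frames from the question; called as `df2.combine_first(df1)`, it takes values from df2 where they exist and fills the remaining cells from df1, over the union of both indexes and both sets of columns.

```python
import pandas as pd

# Frames from the original question: df2 has column E instead of D.
df1 = pd.DataFrame(
    {"A": ["A0", "A1", "A2", "A3"],
     "B": ["B0", "B1", "B2", "B3"],
     "C": ["C0", "C1", "C2", "C3"],
     "D": ["D0", "D1", "D2", "D3"]},
    index=[0, 1, 2, 3],
)
df2 = pd.DataFrame(
    {"A": ["A4", "A5", "A6", "A7"],
     "B": ["B4", "B5", "B6", "B7"],
     "C": ["C4", "C5", "C6", "C7"],
     "E": ["E4", "E5", "E6", "E7"]},
    index=[1, 2, 3, 4],
)

# df2's values take priority; cells missing from df2 are filled from df1.
result = df2.combine_first(df1)
print(result)
```

Here column D keeps df1's values for rows 0 to 3 and is NaN for row 4, because df2 has no D column, and column E is NaN for row 0. One caveat: like `update`, `combine_first` treats NaN in df2 as missing, so a NaN in df2 will not overwrite an existing value from df1.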
