How to merge two DataFrames in pandas to replace NaN


Problem description

I want to do this in pandas:

I have two DataFrames, A and B, and I want to replace only the NaN values in A with the corresponding values from B.

A                                                
2014-04-17 12:59:00  146.06250  146.0625  145.93750  145.93750
2014-04-17 13:00:00  145.90625  145.9375  145.87500  145.90625
2014-04-17 13:01:00  145.90625       NaN  145.90625        NaN
2014-04-17 13:02:00        NaN       NaN  145.93750  145.96875

B
2014-04-17 12:59:00   146 2/32   146 2/32  145 30/32  145 30/32
2014-04-17 13:00:00  145 29/32  145 30/32  145 28/32  145 29/32
2014-04-17 13:01:00  145 29/32        146  145 29/32        147
2014-04-17 13:02:00        146        146  145 30/32  145 31/32

Result:
2014-04-17 12:59:00  146.06250  146.0625  145.93750  145.93750
2014-04-17 13:00:00  145.90625  145.9375  145.87500  145.90625
2014-04-17 13:01:00  145.90625       146  145.90625        147
2014-04-17 13:02:00        146       146  145.93750  145.96875

Thanks in advance.

Recommended answer

The officially recommended way to do exactly this is A.combine_first(B). See the official documentation for more information.
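As a minimal sketch of what combine_first does, the snippet below rebuilds the last two rows of the question with plain floats. The column names a to d are hypothetical (the question shows none), and B's values are made-up whole numbers chosen only so that it is easy to see which cells came from B.

```python
import numpy as np
import pandas as pd

idx = pd.to_datetime(["2014-04-17 13:01:00", "2014-04-17 13:02:00"])
cols = ["a", "b", "c", "d"]  # hypothetical column names, not from the question

# A has gaps (NaN) that should be filled.
A = pd.DataFrame(
    [[145.90625, np.nan, 145.90625, np.nan],
     [np.nan, np.nan, 145.93750, 145.96875]],
    index=idx, columns=cols,
)

# B supplies the fallback values (illustrative numbers only).
B = pd.DataFrame(
    [[145.0, 146.0, 145.0, 147.0],
     [146.0, 146.0, 145.0, 145.0]],
    index=idx, columns=cols,
)

# Values from A win wherever they exist; B is used only where A is NaN.
result = A.combine_first(B)
print(result)
```

Cells that already hold a value in A are left untouched, so only the four NaN cells pick up values from B.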

However, on large DataFrames it is massively outperformed by A.fillna(B) (tested with 25,000 elements):

In[891]: %timeit df.fillna(df2)
1000 loops, best of 3: 333 µs per loop
In[892]: %timeit df.combine_first(df2)
100 loops, best of 3: 2.15 ms per loop
In[894]: (df.fillna(df2) == df.combine_first(df2)).all().all()
Out[890]: True
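The equality check above holds when both frames share the same index and columns. As a hedged illustration of where the two calls can differ, the sketch below uses two small made-up frames in which B has one extra row: fillna keeps A's labels, while combine_first returns the union of both indexes. The frames and the name same_on_A are assumptions for this example only.

```python
import numpy as np
import pandas as pd

A = pd.DataFrame({"x": [1.0, np.nan], "y": [np.nan, 4.0]}, index=[0, 1])
# B has an extra row (index 2) that A does not have.
B = pd.DataFrame(
    {"x": [10.0, 20.0, 30.0], "y": [40.0, 50.0, 60.0]},
    index=[0, 1, 2],
)

filled = A.fillna(B)           # keeps A's index: rows 0 and 1 only
combined = A.combine_first(B)  # union of both indexes: rows 0, 1 and 2

print(filled)
print(combined)

# On the labels that A already has, the two results agree.
same_on_A = (filled == combined.loc[A.index, A.columns]).all().all()
print(same_on_A)
```

So fillna is a reasonable drop-in when A and B are known to have identical labels; if B may carry rows or columns that A lacks and those should appear in the result, combine_first is the one that adds them.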
