How to merge two dataframes in pandas to replace NaN
Problem description
I want to do this in pandas: I have two dataframes, A and B, and I want to replace only the NaN values in A with the corresponding values from B.
A
2014-04-17 12:59:00 146.06250 146.0625 145.93750 145.93750
2014-04-17 13:00:00 145.90625 145.9375 145.87500 145.90625
2014-04-17 13:01:00 145.90625 NaN 145.90625 NaN
2014-04-17 13:02:00 NaN NaN 145.93750 145.96875
B
2014-04-17 12:59:00 146 2/32 146 2/32 145 30/32 145 30/32
2014-04-17 13:00:00 145 29/32 145 30/32 145 28/32 145 29/32
2014-04-17 13:01:00 145 29/32 146 145 29/32 147
2014-04-17 13:02:00 146 146 145 30/32 145 31/32
Result:
2014-04-17 12:59:00 146.06250 146.0625 145.93750 145.93750
2014-04-17 13:00:00 145.90625 145.9375 145.87500 145.90625
2014-04-17 13:01:00 145.90625 146 145.90625 147
2014-04-17 13:02:00 146 146 145.93750 145.96875
Thanks in advance.
Recommended answer
The officially promoted way to do exactly this is A.combine_first(B). Further information is in the official documentation.
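As a minimal sketch of that call, the last two rows of the question's data can be rebuilt by hand. The column names (open, high, low, close) are assumptions, since the question does not show any, and B is assumed to have already been converted from its 32nds notation to floats.

```python
import numpy as np
import pandas as pd

# Shared timestamp index for the two rows of A that contain NaN.
idx = pd.to_datetime(["2014-04-17 13:01:00", "2014-04-17 13:02:00"])
cols = ["open", "high", "low", "close"]  # assumed names, not given in the question

A = pd.DataFrame(
    [[145.90625, np.nan, 145.90625, np.nan],
     [np.nan, np.nan, 145.93750, 145.96875]],
    index=idx, columns=cols,
)

# B with its fractional quotes already converted to floats (e.g. 145 29/32 -> 145.90625).
B = pd.DataFrame(
    [[145.90625, 146.0, 145.90625, 147.0],
     [146.0, 146.0, 145.93750, 145.96875]],
    index=idx, columns=cols,
)

# Values from A win wherever they exist; B only fills A's NaN cells.
result = A.combine_first(B)
print(result)
```

Non-NaN cells of A are left untouched, so only the four missing cells take their values from B.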
However, on large dataframes A.fillna(B) is considerably faster, roughly 6x in the timings below (tests performed with 25000 elements), and it produced identical results:
In[891]: %timeit df.fillna(df2)
1000 loops, best of 3: 333 µs per loop
In[892]: %timeit df.combine_first(df2)
100 loops, best of 3: 2.15 ms per loop
In[894]: (df.fillna(df2) == df.combine_first(df2)).all().all()
Out[890]: True
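The equality check above holds when both dataframes share the same index and columns. As a small illustration with made-up data (the frames and the column name x here are hypothetical), the two methods can differ once B has labels that A lacks: fillna keeps A's index, while combine_first returns the union of both indexes.

```python
import numpy as np
import pandas as pd

# Hypothetical frames: B has one extra row (index 2) that A does not have.
A = pd.DataFrame({"x": [1.0, np.nan]}, index=[0, 1])
B = pd.DataFrame({"x": [10.0, 20.0, 30.0]}, index=[0, 1, 2])

filled = A.fillna(B)           # aligned on labels, result keeps A's index
combined = A.combine_first(B)  # result covers the union of both indexes

print(filled)
print(combined)
```

If A and B are known to have identical labels, as in the question, either call should give the same values, and the timing difference becomes the deciding factor.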