Pandas: backfilling missing data and keeping the index
Problem description
I have a dataset with many missing values. The nominal time interval is 5 minutes, but many timestamps are missing as well. The DataFrame looks like this:
Time A
2000-01-01 00:00:00 NaN
2010-01-01 00:00:00 NaN
2015-01-01 00:00:00 NaN
2015-12-01 00:00:00 NaN
2015-12-01 12:40:00 NaN
2015-12-01 12:45:00 NaN
df.dropna().head(6)
Time A
2015-12-04 11:50:00 1.0
2016-04-11 16:15:00 1.0
2016-04-11 16:25:00 1.0
2016-04-29 22:05:00 1.0
2016-07-01 14:25:00 1.0
2016-07-23 21:20:00 1.0
I want to backfill the missing values for up to 10 days without changing the index. I used the following command, but the results did not change:
#fill the missing data
df_filled=df.groupby(df.index).fillna(method='bfill', limit=12*240)
df_filled.dropna().head(6)
Time A
2015-12-04 11:50:00 1.0
2016-04-11 16:15:00 1.0
2016-04-11 16:25:00 1.0
2016-04-29 22:05:00 1.0
2016-07-01 14:25:00 1.0
2016-07-23 21:20:00 1.0
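A likely reason the output is unchanged, assuming the timestamps in the index are unique: grouping by `df.index` puts every row into its own group, so no group ever contains a later value to copy backwards. The sketch below reproduces this on a small made-up frame (the data and the names `per_row` and `whole` are illustrative only) and compares it with backfilling the frame as a whole.

```python
import numpy as np
import pandas as pd

# Made-up sample mirroring the snapshot in the question.
idx = pd.to_datetime(["2015-12-04 11:40", "2015-12-04 11:45", "2015-12-04 11:50"])
df = pd.DataFrame({"A": [np.nan, np.nan, 1.0]}, index=idx)

# Grouping by the index: each timestamp forms a one-row group,
# so backfilling inside a group has nothing to fill from.
per_row = df.groupby(df.index).bfill()

# Backfilling without grouping copies the later value upwards.
whole = df.bfill()

print(per_row)
print(whole)
```

Note also that `limit` counts rows, not elapsed time, so `12*240` rows correspond to 10 days only when no 5-minute timestamps are missing.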
I would appreciate it if anyone could guide me.
Thanks in advance.
A snapshot around one of the values in df:
12/4/2015 11:15 NaN
12/4/2015 11:20 NaN
12/4/2015 11:25 NaN
12/4/2015 11:30 NaN
12/4/2015 11:35 NaN
12/4/2015 11:40 NaN
12/4/2015 11:45 NaN
12/4/2015 11:50 1
What I want is to backfill data for up to 10 days, so for the same data point the result should be:
12/4/2015 11:15 1
12/4/2015 11:20 1
12/4/2015 11:25 1
12/4/2015 11:30 1
12/4/2015 11:35 1
12/4/2015 11:40 1
12/4/2015 11:45 1
12/4/2015 11:50 1
Recommended answer
I found a solution that works:
df_filled=df.groupby(pd.Grouper(freq='10D')).fillna(method='bfill')
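As a hedged illustration of what this does: `pd.Grouper(freq='10D')` cuts the DatetimeIndex into fixed 10-day bins and the backfill runs inside each bin, so a value is never copied across a bin boundary even when the gap is shorter than 10 days. The sketch below uses made-up data and `GroupBy.bfill()`, since the `fillna(method=...)` spelling is deprecated in recent pandas releases. The second part is an alternative that is not from the original answer: it keeps a backfilled value only when the next observation lies at most 10 days ahead. The names `binned`, `stamps`, `next_valid`, `within` and `windowed` are illustrative.

```python
import numpy as np
import pandas as pd

# Made-up data: irregular timestamps, a single observed value.
idx = pd.to_datetime([
    "2015-11-01 00:00",  # more than 10 days before the observation
    "2015-11-30 23:00",  # within 10 days, but in an earlier 10-day bin
    "2015-12-01 12:40",
    "2015-12-04 11:45",
    "2015-12-04 11:50",  # the only observed value
    "2015-12-05 00:00",  # nothing observed afterwards
])
df = pd.DataFrame({"A": [np.nan, np.nan, np.nan, np.nan, 1.0, np.nan]}, index=idx)

# 1) Fixed 10-day bins (the approach of the answer), backfilled per bin.
binned = df.groupby(pd.Grouper(freq="10D")).bfill()

# 2) Alternative sketch: a true "at most 10 days ahead" window.
stamps = pd.Series(df.index, index=df.index)
# Timestamp of the next non-missing observation for every row (NaT if none).
next_valid = stamps.where(df["A"].notna()).bfill()
# Keep a backfilled value only if that observation is no more than 10 days away.
within = (next_valid - stamps) <= pd.Timedelta(days=10)
windowed = df["A"].bfill().where(within)

print(binned)
print(windowed)
```

In both variants the index is left untouched. With this sample, the 2015-11-30 23:00 row stays NaN under the fixed bins but is filled by the windowed variant, which is the practical difference between the two.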