Pandas: backfilling missing data and keeping the index


Question

I have a dataset with many missing values. The time interval is 5 minutes, but many timestamps are missing as well. The DataFrame looks like this:

Time                   A
2000-01-01 00:00:00   NaN
2010-01-01 00:00:00   NaN
2015-01-01 00:00:00   NaN
2015-12-01 00:00:00   NaN
2015-12-01 12:40:00   NaN
2015-12-01 12:45:00   NaN


df.dropna().head(6)

Time                    A
2015-12-04 11:50:00    1.0
2016-04-11 16:15:00    1.0
2016-04-11 16:25:00    1.0
2016-04-29 22:05:00    1.0
2016-07-01 14:25:00    1.0
2016-07-23 21:20:00    1.0

I want to backfill the missing values for up to 10 days without changing the index. I used the following command, but the results did not change.

#fill the missing data
df_filled=df.groupby(df.index).fillna(method='bfill', limit=12*240)

df_filled.dropna().head(6)

Time                    A
2015-12-04 11:50:00    1.0
2016-04-11 16:15:00    1.0
2016-04-11 16:25:00    1.0
2016-04-29 22:05:00    1.0
2016-07-01 14:25:00    1.0
2016-07-23 21:20:00    1.0

I would appreciate it if anyone could guide me. Thanks in advance.

A snapshot of some of the values from df:

12/4/2015 11:15 NaN
12/4/2015 11:20 NaN
12/4/2015 11:25 NaN
12/4/2015 11:30 NaN
12/4/2015 11:35 NaN
12/4/2015 11:40 NaN
12/4/2015 11:45 NaN
12/4/2015 11:50 1

I want to backfill data for up to 10 days, so for the same data points the result should be:

12/4/2015 11:15 1
12/4/2015 11:20 1
12/4/2015 11:25 1
12/4/2015 11:30 1
12/4/2015 11:35 1
12/4/2015 11:40 1
12/4/2015 11:45 1
12/4/2015 11:50 1

Recommended answer

I found a solution that works:

# group rows into fixed 10-day bins and backfill within each bin; the index is unchanged
df_filled = df.groupby(pd.Grouper(freq='10D')).fillna(method='bfill')
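To see why grouping by `df.index` changed nothing while grouping by `pd.Grouper(freq='10D')` does, here is a minimal sketch. The small frame is hypothetical sample data modelled on the snapshot in the question, and it uses the groupby `.bfill()` method for the backward fill.

```python
import numpy as np
import pandas as pd

# Hypothetical sample: a 5-minute DatetimeIndex with a single observed value at the end.
idx = pd.date_range("2015-12-04 11:15", "2015-12-04 11:50", freq="5min")
df = pd.DataFrame({"A": [np.nan] * 7 + [1.0]}, index=idx)
df.index.name = "Time"

# Grouping by the index puts every unique timestamp in its own one-row group,
# so no group ever contains a later value to fill from.
per_row = df.groupby(df.index).bfill()

# Grouping into fixed 10-day bins lets the backward fill work inside each bin.
binned = df.groupby(pd.Grouper(freq="10D")).bfill()

print(per_row["A"].isna().sum())  # NaNs remaining after the per-timestamp grouping
print(binned)
```

Both results keep the original index. One caveat: the 10-day bins are fixed windows, so a gap that straddles a bin boundary is not filled from a value in the next bin.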

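Fixed 10-day bins, as used in the answer, are not quite the same as "fill only if the next observation is at most 10 days away". If that rolling behaviour is what is wanted, one possible sketch is shown below. The helper name `bfill_within` and the sample series are assumptions made for illustration and are not part of the original answer.

```python
import numpy as np
import pandas as pd

def bfill_within(series, window):
    """Backfill NaNs only where the next valid observation is at most `window` later.

    Hypothetical helper; assumes `series` has a sorted DatetimeIndex.
    """
    # Each row's own timestamp, as a Series sharing the original index.
    times = pd.Series(series.index, index=series.index)
    # Timestamp of the next valid observation at or after each row (NaT if none).
    next_valid = times.where(series.notna().to_numpy()).bfill()
    # Keep the backfilled value only when that observation is close enough in time.
    keep = ((next_valid - times) <= window).to_numpy()
    return series.bfill().where(keep)

# Hypothetical sample with irregular timestamps.
idx = pd.to_datetime([
    "2015-11-01 00:00",  # more than 10 days before the observation: stays NaN
    "2015-12-01 12:40",  # about 3 days before: filled
    "2015-12-04 11:45",  # 5 minutes before: filled
    "2015-12-04 11:50",  # the observation itself
])
s = pd.Series([np.nan, np.nan, np.nan, 1.0], index=idx, name="A")

filled = bfill_within(s, pd.Timedelta(days=10))
print(filled)
```

Because the cutoff is measured in elapsed time rather than in a row count, it still behaves as intended when timestamps are missing from the index, which a `limit=12*240` row limit would not guarantee.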