Add missing timestamp rows to a dataframe

Problem description

I have a dataframe containing data measured at two-hour intervals each day, but some of the intervals are missing. My dataset looks like this:

2020-12-01 08:00:00 145.9
2020-12-01 10:00:00 100.0
2020-12-01 16:00:00 99.3
2020-12-01 18:00:00 91.0

I'm trying to insert the missing time intervals and fill their values with NaN:

2020-12-01 08:00:00 145.9
2020-12-01 10:00:00 100.0
2020-12-01 12:00:00 NaN
2020-12-01 14:00:00 NaN
2020-12-01 16:00:00 99.3
2020-12-01 18:00:00 91.0

I would appreciate any help on how to achieve this in Python, as I'm a newbie who is just starting out with it.

Recommended answer

Assuming your df looks like this:

              datetime  value
0  2020-12-01T08:00:00  145.9
1  2020-12-01T10:00:00  100.0
2  2020-12-01T16:00:00   99.3
3  2020-12-01T18:00:00   91.0

Make sure the datetime column has a datetime dtype:

import pandas as pd

df['datetime'] = pd.to_datetime(df['datetime'])

You can then resample to a 2-hourly frequency:

df.resample('2H', on='datetime').mean()

                     value
datetime                  
2020-12-01 08:00:00  145.9
2020-12-01 10:00:00  100.0
2020-12-01 12:00:00    NaN
2020-12-01 14:00:00    NaN
2020-12-01 16:00:00   99.3
2020-12-01 18:00:00   91.0
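
For reference, the steps above can be put together into one runnable sketch. The sample values are copied from the question and the column names `datetime` and `value` follow the answer's assumed layout. The frequency is written as `"2h"` here because, as far as I know, recent pandas releases emit a deprecation warning for the uppercase `"2H"` spelling; treat that choice as an assumption about your pandas version rather than a requirement.

```python
import pandas as pd

# Sample data from the question; column names follow the answer's assumed layout.
df = pd.DataFrame({
    "datetime": ["2020-12-01T08:00:00", "2020-12-01T10:00:00",
                 "2020-12-01T16:00:00", "2020-12-01T18:00:00"],
    "value": [145.9, 100.0, 99.3, 91.0],
})

# Convert the string column to a real datetime dtype.
df["datetime"] = pd.to_datetime(df["datetime"])

# Resample to two-hour bins; bins with no rows come out as NaN.
filled = df.resample("2h", on="datetime").mean()
print(filled)
```

The 12:00 and 14:00 rows are the ones created by the resampling step, and their `value` is NaN.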

Note that you don't need to set the on= keyword if your df already has a datetime index. The df resulting from the resampling will have a datetime index.
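
A minimal sketch of that variant, assuming the timestamps have been moved into the index with `set_index` (the variable names are only illustrative):

```python
import pandas as pd

# Same sample data as in the question, with column names as assumed in the answer.
df = pd.DataFrame({
    "datetime": pd.to_datetime(["2020-12-01 08:00:00", "2020-12-01 10:00:00",
                                "2020-12-01 16:00:00", "2020-12-01 18:00:00"]),
    "value": [145.9, 100.0, 99.3, 91.0],
})

# Move the timestamps into the index, so resample() needs no on= keyword.
indexed = df.set_index("datetime")
filled = indexed.resample("2h").mean()

# The result is indexed by timestamp as well.
print(type(filled.index).__name__)
print(filled)
```

If you need the timestamps back as an ordinary column afterwards, `filled.reset_index()` should do that.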

Also note that I'm using .mean() as the aggregation function, meaning that if you have multiple values within a two-hour interval, you'll get their mean.
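
To illustrate the aggregation, here is a small sketch with made-up data (the 09:00 reading and all values are hypothetical, not from the question). Two readings fall into the 08:00 bin and get averaged, while the empty 10:00 bin becomes NaN:

```python
import pandas as pd

# Hypothetical data: 08:00 and 09:00 both fall into the two-hour bin starting at 08:00.
df = pd.DataFrame({
    "datetime": pd.to_datetime(["2020-12-01 08:00:00", "2020-12-01 09:00:00",
                                "2020-12-01 12:00:00"]),
    "value": [100.0, 200.0, 50.0],
})

out = df.resample("2h", on="datetime").mean()
print(out)
# The 08:00 row holds the mean of 100.0 and 200.0, the 10:00 row is NaN,
# and the 12:00 row keeps its single value.
```

If averaging is not what you want, another aggregation such as `.first()` or `.max()` can be used in place of `.mean()`.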
