Add missing timestamp rows to a dataframe
Question
I have a dataframe containing data that were measured at two-hour intervals each day; however, some time intervals are missing. My dataset looks like this:
2020-12-01 08:00:00 145.9
2020-12-01 10:00:00 100.0
2020-12-01 16:00:00 99.3
2020-12-01 18:00:00 91.0
I'm trying to insert the missing time intervals and fill their values with NaN:
2020-12-01 08:00:00 145.9
2020-12-01 10:00:00 100.0
2020-12-01 12:00:00 NaN
2020-12-01 14:00:00 NaN
2020-12-01 16:00:00 99.3
2020-12-01 18:00:00 91.0
I would appreciate any help on how to achieve this in Python, as I'm a newbie just starting out with the language.
Recommended answer
Assuming your df looks like this:
datetime value
0 2020-12-01T08:00:00 145.9
1 2020-12-01T10:00:00 100.0
2 2020-12-01T16:00:00 99.3
3 2020-12-01T18:00:00 91.0
First, make sure the datetime column has a datetime dtype:
df['datetime'] = pd.to_datetime(df['datetime'])
You can then resample to a 2-hourly frequency:
df.resample('2H', on='datetime').mean()
value
datetime
2020-12-01 08:00:00 145.9
2020-12-01 10:00:00 100.0
2020-12-01 12:00:00 NaN
2020-12-01 14:00:00 NaN
2020-12-01 16:00:00 99.3
2020-12-01 18:00:00 91.0
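For reference, the two steps above can be put together into a self-contained sketch. The sample frame is rebuilt by hand from the question's data, and the column names datetime and value are taken from the answer's assumed layout. One deviation from the answer: the interval is passed as pd.Timedelta(hours=2) rather than the '2H' string, since newer pandas releases emit a deprecation warning for the uppercase 'H' alias.

```python
import pandas as pd

# Sample data rebuilt from the question (column names are assumed).
df = pd.DataFrame(
    {
        "datetime": [
            "2020-12-01T08:00:00",
            "2020-12-01T10:00:00",
            "2020-12-01T16:00:00",
            "2020-12-01T18:00:00",
        ],
        "value": [145.9, 100.0, 99.3, 91.0],
    }
)

# Convert the string column to a real datetime dtype.
df["datetime"] = pd.to_datetime(df["datetime"])

# Resample into 2-hour bins; bins without any rows come out as NaN.
# A Timedelta is used here in place of the '2H' string alias.
resampled = df.resample(pd.Timedelta(hours=2), on="datetime").mean()

print(resampled)
```

The 12:00 and 14:00 rows appear in the result with NaN in the value column, which is the layout asked for in the question.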
Note that you don't need to set the on= keyword if your df already has a datetime index. The df resulting from resampling will have a datetime index.
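As a small illustration of the datetime-index case, here is a sketch in which the timestamps are placed in the index up front, so resample is called without on=. The frame is a hand-built stand-in for the question's data, and the index name datetime is an assumption.

```python
import pandas as pd

# Hypothetical frame whose index already holds the timestamps.
index = pd.to_datetime(
    [
        "2020-12-01 08:00:00",
        "2020-12-01 10:00:00",
        "2020-12-01 16:00:00",
        "2020-12-01 18:00:00",
    ]
)
df_indexed = pd.DataFrame({"value": [145.9, 100.0, 99.3, 91.0]}, index=index)
df_indexed.index.name = "datetime"

# No on= keyword needed: resample works on the DatetimeIndex directly.
resampled_indexed = df_indexed.resample(pd.Timedelta(hours=2)).mean()

print(resampled_indexed)
print(type(resampled_indexed.index))
```

If you need the timestamps back as an ordinary column afterwards, calling reset_index() on the result does that.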
Also note that I'm using .mean() as the aggregation function, meaning that if you have multiple values within a two-hour interval, you'll get their mean.
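To make the effect of .mean() concrete, here is a sketch with made-up readings (not from the question) in which two measurements fall into the same two-hour bin. The timestamps and values are purely illustrative.

```python
import pandas as pd

# Made-up data: 08:00 and 09:30 both fall into the 08:00-10:00 bin.
df_multi = pd.DataFrame(
    {
        "datetime": pd.to_datetime(
            [
                "2020-12-01 08:00:00",
                "2020-12-01 09:30:00",
                "2020-12-01 12:00:00",
            ]
        ),
        "value": [100.0, 110.0, 90.0],
    }
)

# The 08:00 bin gets the mean of its two readings (105.0),
# the empty 10:00 bin becomes NaN, and the 12:00 bin keeps 90.0.
averaged = df_multi.resample(pd.Timedelta(hours=2), on="datetime").mean()

print(averaged)
```

If averaging is not what you want for such bins, another aggregation such as .first() or .max() can be used in place of .mean().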