How to fill missing timestamps for a Time column for a date in pandas
Problem description
I have time-series data as below:
print(df)
ric datel timel val
0 xyz 2017-01-01 09:00:00 2
1 xyz 2017-01-01 09:04:00 5
2 xyz 2017-01-01 09:37:00 6
Now I have to fill in the missing timestamps up to 09:45:00.
Expected output:
ric datel timel val
0 xyz 2017-01-01 09:00:00 2
1 xyz 2017-01-01 09:01:00 nan
2 xyz 2017-01-01 09:02:00 nan
3 xyz 2017-01-01 09:03:00 nan
4 xyz 2017-01-01 09:04:00 5
...
...
37 xyz 2017-01-01 09:37:00 6
...
...
45 xyz 2017-01-01 09:45:00 nan
My attempt:
df1=df.resample("1 min", on ='datel').first()
The output is:
ric datel timel val
datel
2017-01-01 xyz 2017-01-01 09:00:00 2
I also tried pd.date_range, but it mostly works with a datetime column.
I have two separate columns, date and time. Is there a way to achieve this without combining the date and time columns into a datetime?
Recommended answer
The main idea is to use reindex with the times created by date_range:
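import pandas as pd
# keep only the time-of-day part as datetime.time objects
df['timel'] = pd.to_datetime(df['timel']).dt.time
# one entry per minute from the earliest time up to 09:45:00
start = pd.to_datetime(str(df['timel'].min()))
end = pd.to_datetime('09:45:00')
dates = pd.date_range(start=start, end=end, freq='1min').time
#print (dates)
# reindex by the full list of times, then restore the original column order
df = df.set_index('timel').reindex(dates).reset_index().reindex(columns=df.columns)
# forward-fill every column except val, which stays NaN in the new rows
cols = df.columns.difference(['val'])
df[cols] = df[cols].ffill()
print (df.head())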
ric datel timel val
0 xyz 2017-01-01 09:00:00 2.0
1 xyz 2017-01-01 09:01:00 NaN
2 xyz 2017-01-01 09:02:00 NaN
3 xyz 2017-01-01 09:03:00 NaN
4 xyz 2017-01-01 09:04:00 5.0
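For readers who want to run the reindex approach end to end, here is a minimal self-contained sketch. It assumes the sample frame from the question with datel and timel stored as strings; the names anchor, times and filled are only illustrative, and the explicit format string is an assumption added so that parsing does not depend on format inference.

```python
import datetime as dt

import pandas as pd

# Sample data reconstructed from the question (dates and times as strings).
df = pd.DataFrame({
    "ric": ["xyz", "xyz", "xyz"],
    "datel": ["2017-01-01"] * 3,
    "timel": ["09:00:00", "09:04:00", "09:37:00"],
    "val": [2, 5, 6],
})

# Convert the time strings to datetime.time objects with an explicit format.
df["timel"] = pd.to_datetime(df["timel"], format="%H:%M:%S").dt.time

# date_range needs full datetimes, so attach an arbitrary anchor date;
# only the time-of-day part is kept afterwards.
anchor = dt.date(1900, 1, 1)
start = dt.datetime.combine(anchor, df["timel"].min())
end = dt.datetime.combine(anchor, dt.time(9, 45))
times = pd.Index(pd.date_range(start=start, end=end, freq="min").time, name="timel")

# Reindex by every minute, then restore the original column order.
filled = (
    df.set_index("timel")
      .reindex(times)
      .reset_index()
      .reindex(columns=df.columns)
)

# Forward-fill the descriptive columns; val stays NaN in the inserted rows.
cols = filled.columns.difference(["val"])
filled[cols] = filled[cols].ffill()

print(filled.head())
```

Naming the new index timel explicitly keeps reset_index from producing a column called index, so the final column reordering does not depend on how reindex handles index names.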
A similar solution with resample:
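# parse the times as full datetimes so that resample can be used
df['timel'] = pd.to_datetime(df['timel'])
# if the row for 09:45:00 is missing, add it so the resampled range ends there
if not (df['timel'] == pd.to_datetime('09:45:00')).any():
    df.loc[len(df.index), 'timel'] = pd.to_datetime('09:45:00')
# one row per minute, then restore the original column order
df = df.set_index('timel').resample("1min").first().reset_index().reindex(columns=df.columns)
# forward-fill every column except val
cols = df.columns.difference(['val'])
df[cols] = df[cols].ffill()
# convert back to plain time-of-day values
df['timel'] = df['timel'].dt.time
print (df.head())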
ric datel timel val
0 xyz 2017-01-01 09:00:00 2.0
1 xyz 2017-01-01 09:01:00 NaN
2 xyz 2017-01-01 09:02:00 NaN
3 xyz 2017-01-01 09:03:00 NaN
4 xyz 2017-01-01 09:04:00 5.0
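The resample route can also be sketched without appending a placeholder row. The version below is an alternative under stated assumptions, not the answer's exact method: it temporarily combines datel and timel into a helper column ts (a hypothetical name), resamples, and then splits the result back into two string columns. Because resample only covers the span between the first and last observation, a reindex on an explicit date_range is used to extend the rows up to 09:45:00.

```python
import pandas as pd

# Sample data reconstructed from the question (dates and times as strings).
raw = pd.DataFrame({
    "ric": ["xyz", "xyz", "xyz"],
    "datel": ["2017-01-01"] * 3,
    "timel": ["09:00:00", "09:04:00", "09:37:00"],
    "val": [2, 5, 6],
})

# Temporary combined timestamp, used only as a helper index.
ts = pd.to_datetime(raw["datel"] + " " + raw["timel"], format="%Y-%m-%d %H:%M:%S")

# Every minute from the first observation up to the requested end time.
end = pd.Timestamp("2017-01-01 09:45:00")
full = pd.date_range(start=ts.min(), end=end, freq="min", name="ts")

out = (
    raw.assign(ts=ts)
       .set_index("ts")[["ric", "val"]]
       .resample("min").first()   # fills the gaps between 09:00 and 09:37
       .reindex(full)             # extends the range up to 09:45
       .reset_index()
)

# Forward-fill the identifier and rebuild the separate date and time columns.
out["ric"] = out["ric"].ffill()
out["datel"] = out["ts"].dt.strftime("%Y-%m-%d")
out["timel"] = out["ts"].dt.strftime("%H:%M:%S")
out = out[["ric", "datel", "timel", "val"]]

print(out.tail())
```

The final frame again has separate datel and timel columns, so the combined timestamp exists only during the intermediate steps.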