Creating regular 15-minute time-series from irregular time-series


Problem description

I have an irregular time-series (with DateTime and RainfallValue) in a csv file C:\SampleData.csv:


DateTime,RainInches
1/6/2000 11:59,0
1/6/2000 23:59,0.01
1/7/2000 11:59,0
1/13/2000 23:59,0
1/14/2000 0:00,0
1/14/2000 23:59,0
4/14/2000 3:07,0.01
4/14/2000 3:12,0.03
4/14/2000 3:19,0.01
12/31/2001 22:44,0
12/31/2001 22:59,0.07
12/31/2001 23:14,0
12/31/2001 23:29,0
12/31/2001 23:44,0.01
12/31/2001 23:59,0.01

Note: The irregular time-steps could be 1 min, 15 min, 1 hour, etc. Also, there could be multiple observations in a desired 15-min interval.

I am trying to create a regular 15-minute time-series from 2000-01-01 to 2001-12-31 that should look like:


2000-01-01 00:15:00 0.00
2000-01-01 00:30:00 0.00
2000-01-01 00:45:00 0.00
...
2001-12-31 23:30:00 0.01
2001-12-31 23:45:00 0.01

Note: The time-series is regular with 15-minute intervals, and missing data is filled with 0. If there is more than one data point in a 15-minute interval, the values are summed.

Here is my code:


library(zoo)
library(xts)

filename = "C:\\SampleData.csv"
ReadData <- read.zoo(filename, format = "%m/%d/%Y %H:%M", sep=",", tz="UTC", header=TRUE) # read .csv as a ZOO object
RawData <- aggregate(ReadData, index(ReadData), sum) # Merge duplicate time stamps and SUM the corresponding data (CAUTION)
RawDataSeries <- as.xts(RawData,order.by =index(RawData)) #convert to an XTS object

RegularTimes <- seq(as.POSIXct("2000-01-01 00:00:00", tz = "UTC"), as.POSIXct("2001-12-31 23:45:00", tz = "UTC"), by = 60*15)
BlankTimeSeries <- xts((rep(0,length(RegularTimes))),order.by = RegularTimes)

MergedTimeSeries <- merge(RawDataSeries,BlankTimeSeries)
TS_sum15min <- period.apply(MergedTimeSeries,endpoints(MergedTimeSeries, "minutes", 15), sum, na.rm = TRUE )

TS_align15min <- align.time( TS_sum15min [endpoints(TS_sum15min , "minutes", 15)], n=60*15)

Problem: The output time series TS_align15min: (a) has repeating blocks of time-stamps (b) starts (mysteriously) from 1999, as:

1999-12-31 19:15:00    0
1999-12-31 19:30:00    0
1999-12-31 19:45:00    0
1999-12-31 20:00:00    0
1999-12-31 20:15:00    0
1999-12-31 20:30:00    0

What am I doing wrong?

Thank you for any direction!

Solution

xts extends zoo, and zoo has extensive examples for this in its vignettes and documentation.
Here is a worked example. I think I have done that more elegantly in the past, but this is all I am coming up with now:

R> twohours <- ISOdatetime(2012,05,02,9,0,0) + seq(0:7)*15*60
R> twohours
[1] "2012-05-02 09:15:00 GMT" "2012-05-02 09:30:00 GMT" 
[3] "2012-05-02 09:45:00 GMT" "2012-05-02 10:00:00 GMT" 
[5] "2012-05-02 10:15:00 GMT" "2012-05-02 10:30:00 GMT" 
[7] "2012-05-02 10:45:00 GMT" "2012-05-02 11:00:00 GMT"
R> set.seed(42)
R> observation <- xts(1:10, order.by=twohours[1]+cumsum(runif(10)*60*10))
R> observation
                           [,1]
2012-05-02 09:24:08.883625    1
2012-05-02 09:33:31.128874    2
2012-05-02 09:36:22.812594    3
2012-05-02 09:44:41.081170    4
2012-05-02 09:51:06.128481    5
2012-05-02 09:56:17.586051    6
2012-05-02 10:03:39.539040    7
2012-05-02 10:05:00.338998    8
2012-05-02 10:11:34.534372    9
2012-05-02 10:18:37.573243   10

A two-hour time grid, and some random observations that leave some cells empty and some filled.

R> to.minutes15(observation)[,4]
                           observation.Close
2012-05-02 09:24:08.883625                 1
2012-05-02 09:44:41.081170                 4
2012-05-02 09:56:17.586051                 6
2012-05-02 10:11:34.534372                 9
2012-05-02 10:18:37.573243                10

That is a 15-minute aggregation, but its timestamps are not on our time grid.
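As an aside, the timestamps of such an aggregation can be snapped onto 15-minute boundaries with align.time(), which the question's code already uses. The following is only a sketch: it rebuilds the same random observations, sets tz = "GMT" explicitly (an assumption, so the result does not depend on the local time zone), and uses the names closes and aligned purely for illustration.

```r
library(xts)

# Same two-hour grid and random observations as in the example above;
# tz = "GMT" is set explicitly here (an assumption for reproducibility)
twohours <- ISOdatetime(2012, 5, 2, 9, 0, 0, tz = "GMT") + (1:8) * 15 * 60
set.seed(42)
observation <- xts(1:10, order.by = twohours[1] + cumsum(runif(10) * 60 * 10))

# Last value seen in each 15-minute period, still stamped with observation times
closes <- to.minutes15(observation)[, 4]

# Round every timestamp up to the next 15-minute boundary
aligned <- align.time(closes, n = 15 * 60)
print(aligned)
```

This moves the timestamps onto the grid but does not add rows for 15-minute cells that had no observation, which is why the walkthrough continues with an explicit grid object.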

R> twoh <- xts(rep(NA,8), order.by=twohours)
R> twoh
                    [,1]
2012-05-02 09:15:00   NA
2012-05-02 09:30:00   NA
2012-05-02 09:45:00   NA
2012-05-02 10:00:00   NA
2012-05-02 10:15:00   NA
2012-05-02 10:30:00   NA
2012-05-02 10:45:00   NA
2012-05-02 11:00:00   NA

R> merge(twoh, observation)
                           twoh observation
2012-05-02 09:15:00.000000   NA          NA
2012-05-02 09:24:08.883625   NA           1
2012-05-02 09:30:00.000000   NA          NA
2012-05-02 09:33:31.128874   NA           2
2012-05-02 09:36:22.812594   NA           3
2012-05-02 09:44:41.081170   NA           4
2012-05-02 09:45:00.000000   NA          NA
2012-05-02 09:51:06.128481   NA           5
2012-05-02 09:56:17.586051   NA           6
2012-05-02 10:00:00.000000   NA          NA
2012-05-02 10:03:39.539040   NA           7
2012-05-02 10:05:00.338998   NA           8
2012-05-02 10:11:34.534372   NA           9
2012-05-02 10:15:00.000000   NA          NA
2012-05-02 10:18:37.573243   NA          10
2012-05-02 10:30:00.000000   NA          NA
2012-05-02 10:45:00.000000   NA          NA
2012-05-02 11:00:00.000000   NA          NA

New xts object, and merged object. Now use na.locf() to carry the observations forward:

R> na.locf(merge(twoh, observation)[,2])
                           observation
2012-05-02 09:15:00.000000          NA
2012-05-02 09:24:08.883625           1
2012-05-02 09:30:00.000000           1
2012-05-02 09:33:31.128874           2
2012-05-02 09:36:22.812594           3
2012-05-02 09:44:41.081170           4
2012-05-02 09:45:00.000000           4
2012-05-02 09:51:06.128481           5
2012-05-02 09:56:17.586051           6
2012-05-02 10:00:00.000000           6
2012-05-02 10:03:39.539040           7
2012-05-02 10:05:00.338998           8
2012-05-02 10:11:34.534372           9
2012-05-02 10:15:00.000000           9
2012-05-02 10:18:37.573243          10
2012-05-02 10:30:00.000000          10
2012-05-02 10:45:00.000000          10
2012-05-02 11:00:00.000000          10

And then we can merge again as an inner join on the time-grid xts twoh:

R> merge(twoh, na.locf(merge(twoh, observation)[,2]), join="inner")[,2]
                    observation
2012-05-02 09:15:00          NA
2012-05-02 09:30:00           1
2012-05-02 09:45:00           4
2012-05-02 10:00:00           6
2012-05-02 10:15:00           9
2012-05-02 10:30:00          10
2012-05-02 10:45:00          10
2012-05-02 11:00:00          10
R> 
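The walkthrough above carries the last observation forward, whereas the question asks for sums per 15-minute interval with 0 for empty intervals. A minimal sketch of one way to do that follows. It makes several assumptions: a few rows of the sample data are inlined instead of reading C:\SampleData.csv, each interval is labelled by its end time, and to_bin_end is a hypothetical helper introduced here, not part of zoo or xts.

```r
library(zoo)
library(xts)

# A few rows of the question's sample data, inlined to keep the sketch self-contained
txt <- "DateTime,RainInches
4/14/2000 3:07,0.01
4/14/2000 3:12,0.03
4/14/2000 3:19,0.01
12/31/2001 23:44,0.01
12/31/2001 23:59,0.01"
raw <- read.csv(text = txt, stringsAsFactors = FALSE)
times <- as.POSIXct(raw$DateTime, format = "%m/%d/%Y %H:%M", tz = "UTC")
z <- zoo(raw$RainInches, times)

# Hypothetical helper: round a timestamp up to the end of its 15-minute interval
to_bin_end <- function(t) {
  as.POSIXct(ceiling(as.numeric(t) / 900) * 900, origin = "1970-01-01", tz = "UTC")
}

# Sum all observations that fall into the same interval
binned <- as.xts(aggregate(z, to_bin_end, sum))

# Regular 15-minute grid; merging with a zero-width xts adds the missing
# grid points, and fill = 0 sets the intervals without observations to 0
grid <- seq(as.POSIXct("2000-01-01 00:15:00", tz = "UTC"),
            as.POSIXct("2002-01-01 00:00:00", tz = "UTC"), by = 15 * 60)
regular <- merge(binned, xts(, grid), fill = 0)

# The first grid point, formatted in a UTC-5 zone instead of UTC
shifted <- format(index(regular)[1], tz = "America/New_York", usetz = TRUE)
print(shifted)
```

The last two lines hint at a possible cause of the 1999 timestamps in the question: 2000-01-01 00:15 UTC prints as 1999-12-31 19:15 in a zone five hours behind UTC, so the series may simply be displayed in the local time zone. That is a guess from the printed output, not something the question confirms.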
