Creating regular 15-minute time-series from irregular time-series
Problem description
I have an irregular time-series (with DateTime and RainfallValue) in a csv file C:\SampleData.csv:
DateTime,RainInches
1/6/2000 11:59,0
1/6/2000 23:59,0.01
1/7/2000 11:59,0
1/13/2000 23:59,0
1/14/2000 0:00,0
1/14/2000 23:59,0
4/14/2000 3:07,0.01
4/14/2000 3:12,0.03
4/14/2000 3:19,0.01
12/31/2001 22:44,0
12/31/2001 22:59,0.07
12/31/2001 23:14,0
12/31/2001 23:29,0
12/31/2001 23:44,0.01
12/31/2001 23:59,0.01
Note: The irregular time-steps could be 1 min, 15 min, 1 hour, etc. Also, there could be multiple observations in a desired 15-min interval.
I am trying to create a regular 15-minute time-series from 2000-01-01 to 2001-12-31 that should look like:
2000-01-01 00:15:00 0.00
2000-01-01 00:30:00 0.00
2000-01-01 00:45:00 0.00
...
2001-12-31 23:30:00 0.01
2001-12-31 23:45:00 0.01
Note: The time-series is regular with 15-minute intervals, and missing data are filled with 0. If there is more than one data point in a 15-minute interval, they are summed.
Here is my code:
library(zoo)
library(xts)
filename = "C:\\SampleData.csv"
ReadData <- read.zoo(filename, format = "%m/%d/%Y %H:%M", sep=",", tz="UTC", header=TRUE) # read .csv as a ZOO object
RawData <- aggregate(ReadData, index(ReadData), sum) # Merge duplicate time stamps and SUM the corresponding data (CAUTION)
RawDataSeries <- as.xts(RawData,order.by =index(RawData)) #convert to an XTS object
RegularTimes <- seq(as.POSIXct("2000-01-01 00:00:00", tz = "UTC"), as.POSIXct("2001-12-31 23:45:00", tz = "UTC"), by = 60*15)
BlankTimeSeries <- xts((rep(0,length(RegularTimes))),order.by = RegularTimes)
MergedTimeSeries <- merge(RawDataSeries,BlankTimeSeries)
TS_sum15min <- period.apply(MergedTimeSeries,endpoints(MergedTimeSeries, "minutes", 15), sum, na.rm = TRUE )
TS_align15min <- align.time( TS_sum15min [endpoints(TS_sum15min , "minutes", 15)], n=60*15)
Problem: The output time series TS_align15min:
(a) has repeating blocks of time-stamps
(b) starts (mysteriously) from 1999, as:
1999-12-31 19:15:00 0
1999-12-31 19:30:00 0
1999-12-31 19:45:00 0
1999-12-31 20:00:00 0
1999-12-31 20:15:00 0
1999-12-31 20:30:00 0
What am I doing wrong?
Thank you for any direction!
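On symptom (b), one thing worth checking is the display time zone: 19:15 on 1999-12-31 is exactly five hours behind 00:15 UTC on 2000-01-01, which is how a UTC instant looks when printed in a UTC-5 zone. The snippet below only illustrates that arithmetic. `America/New_York` is an assumed zone, not something stated in the question, and this may not be the whole explanation.

```r
# A single instant, created in UTC.
t_utc <- as.POSIXct("2000-01-01 00:15:00", tz = "UTC")

# The same instant formatted in an assumed UTC-5 zone (hypothetical choice).
shown_local <- format(t_utc, tz = "America/New_York", usetz = TRUE)
print(shown_local)

# The underlying seconds-since-epoch value does not depend on the display zone.
print(as.numeric(t_utc))
```

If this turns out to be the cause, setting the time zone explicitly when the xts objects are built would be the place to look.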
xts extends zoo, and zoo has extensive examples for this in its vignettes and documentation.
Here is a worked example. I think I have done that more elegantly in the past, but this is all I am coming up with now:
R> twohours <- ISOdatetime(2012,05,02,9,0,0) + seq_len(8)*15*60
R> twohours
[1] "2012-05-02 09:15:00 GMT" "2012-05-02 09:30:00 GMT"
[3] "2012-05-02 09:45:00 GMT" "2012-05-02 10:00:00 GMT"
[5] "2012-05-02 10:15:00 GMT" "2012-05-02 10:30:00 GMT"
[7] "2012-05-02 10:45:00 GMT" "2012-05-02 11:00:00 GMT"
R> set.seed(42)
R> observation <- xts(1:10, order.by=twohours[1]+cumsum(runif(10)*60*10))
R> observation
[,1]
2012-05-02 09:24:08.883625 1
2012-05-02 09:33:31.128874 2
2012-05-02 09:36:22.812594 3
2012-05-02 09:44:41.081170 4
2012-05-02 09:51:06.128481 5
2012-05-02 09:56:17.586051 6
2012-05-02 10:03:39.539040 7
2012-05-02 10:05:00.338998 8
2012-05-02 10:11:34.534372 9
2012-05-02 10:18:37.573243 10
That gives a two-hour time grid and some random observations, leaving some cells empty and some filled.
R> to.minutes15(observation)[,4]
observation.Close
2012-05-02 09:24:08.883625 1
2012-05-02 09:44:41.081170 4
2012-05-02 09:56:17.586051 6
2012-05-02 10:11:34.534372 9
2012-05-02 10:18:37.573243 10
That is a 15-minute aggregation, but its timestamps are those of the last observation in each interval, not points on our time grid.
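Timestamps like these can be moved onto 15-minute boundaries with align.time(), and the same pattern works for per-interval sums. A minimal sketch, assuming UTC and hand-typed observation times (the ones above truncated to whole seconds); `obs`, `sums`, and `aligned` are illustrative names:

```r
library(xts)  # attaches zoo as well

# Hand-typed stand-ins for the random observations (assumption: UTC, whole seconds).
obs_times <- as.POSIXct(paste("2012-05-02",
                              c("09:24:08", "09:33:31", "09:36:22", "09:44:41",
                                "09:51:06", "09:56:17", "10:03:39", "10:05:00",
                                "10:11:34", "10:18:37")), tz = "UTC")
obs <- xts(1:10, order.by = obs_times, tzone = "UTC")

# Row positions of the last observation in each 15-minute period.
ep <- endpoints(obs, on = "minutes", k = 15)

# Sum within each period; the result is stamped with the last observation's time.
sums <- period.apply(obs, ep, sum)

# Round those timestamps up to the next 15-minute boundary.
aligned <- align.time(sums, n = 15 * 60)
print(aligned)
```

Note that align.time() always moves forward, so an observation sitting exactly on a boundary is pushed to the next one. Intervals with no observations are still missing at this point.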
R> twoh <- xts(rep(NA,8), order.by=twohours)
R> twoh
[,1]
2012-05-02 09:15:00 NA
2012-05-02 09:30:00 NA
2012-05-02 09:45:00 NA
2012-05-02 10:00:00 NA
2012-05-02 10:15:00 NA
2012-05-02 10:30:00 NA
2012-05-02 10:45:00 NA
2012-05-02 11:00:00 NA
R> merge(twoh, observation)
twoh observation
2012-05-02 09:15:00.000000 NA NA
2012-05-02 09:24:08.883625 NA 1
2012-05-02 09:30:00.000000 NA NA
2012-05-02 09:33:31.128874 NA 2
2012-05-02 09:36:22.812594 NA 3
2012-05-02 09:44:41.081170 NA 4
2012-05-02 09:45:00.000000 NA NA
2012-05-02 09:51:06.128481 NA 5
2012-05-02 09:56:17.586051 NA 6
2012-05-02 10:00:00.000000 NA NA
2012-05-02 10:03:39.539040 NA 7
2012-05-02 10:05:00.338998 NA 8
2012-05-02 10:11:34.534372 NA 9
2012-05-02 10:15:00.000000 NA NA
2012-05-02 10:18:37.573243 NA 10
2012-05-02 10:30:00.000000 NA NA
2012-05-02 10:45:00.000000 NA NA
2012-05-02 11:00:00.000000 NA NA
New xts object, and merged object. Now use na.locf() to carry the observations forward:
R> na.locf(merge(twoh, observation)[,2])
observation
2012-05-02 09:15:00.000000 NA
2012-05-02 09:24:08.883625 1
2012-05-02 09:30:00.000000 1
2012-05-02 09:33:31.128874 2
2012-05-02 09:36:22.812594 3
2012-05-02 09:44:41.081170 4
2012-05-02 09:45:00.000000 4
2012-05-02 09:51:06.128481 5
2012-05-02 09:56:17.586051 6
2012-05-02 10:00:00.000000 6
2012-05-02 10:03:39.539040 7
2012-05-02 10:05:00.338998 8
2012-05-02 10:11:34.534372 9
2012-05-02 10:15:00.000000 9
2012-05-02 10:18:37.573243 10
2012-05-02 10:30:00.000000 10
2012-05-02 10:45:00.000000 10
2012-05-02 11:00:00.000000 10
And then we can merge again as an inner join on the time-grid xts twoh:
R> merge(twoh, na.locf(merge(twoh, observation)[,2]), join="inner")[,2]
observation
2012-05-02 09:15:00 NA
2012-05-02 09:30:00 1
2012-05-02 09:45:00 4
2012-05-02 10:00:00 6
2012-05-02 10:15:00 9
2012-05-02 10:30:00 10
2012-05-02 10:45:00 10
2012-05-02 11:00:00 10
R>
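The worked example carries the last observation forward, whereas the question asks for sums with zero fill. A minimal sketch of that variant follows. It assumes UTC timestamps, intervals labelled by their end time (so 03:07 and 03:12 both land on 03:15), and three rows borrowed from the sample CSV. `step`, `grid`, `slot`, and `totals` are illustrative names, not part of xts.

```r
library(xts)  # attaches zoo as well

# Three observations taken from the sample CSV, parsed as UTC.
obs_times <- as.POSIXct(c("2000-04-14 03:07:00",
                          "2000-04-14 03:12:00",
                          "2000-04-14 03:19:00"), tz = "UTC")
obs_rain  <- c(0.01, 0.03, 0.01)

step <- 15 * 60  # interval length in seconds

# Regular grid covering the window of interest (shortened here to one hour).
grid <- seq(as.POSIXct("2000-04-14 03:00:00", tz = "UTC"),
            as.POSIXct("2000-04-14 04:00:00", tz = "UTC"), by = step)

# Round each timestamp up to the end of its interval;
# a timestamp already on a boundary stays where it is.
bin_end <- ceiling(as.numeric(obs_times) / step) * step

# Position of each observation's interval on the grid (NA if outside the grid).
slot <- match(bin_end, as.numeric(grid))

# Start from zeros so empty intervals stay 0, then add each observation to its slot.
totals <- numeric(length(grid))
for (i in which(!is.na(slot))) {
  totals[slot[i]] <- totals[slot[i]] + obs_rain[i]
}

regular <- xts(totals, order.by = grid, tzone = "UTC")
print(regular)
```

For the full problem, the grid would run from 2000-01-01 to 2001-12-31 and the observations would come from the CSV. The explicit loop is kept for readability and could be swapped for a grouped sum on larger data.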