Use hourly data with the ts and window functions
Problem description
I have hourly data like the sample below that I am trying to create a time series from and use the window function with. My end goal is to use this to train an Arima model. I'm having a hard time getting ts() or window() to work with my date-time format. I've also tried msts() but couldn't get it to work with the date-time format. I have gotten xts() to work, but it doesn't seem to work correctly with window() or Arima().
Is it possible to use this date-time format with ts() and the window() function? Any tips are greatly appreciated.
Code:
tsData <- ts(SampleData$MedTime[1:24],start='2015-01-01 00:00', frequency=168)
train <- window(tsData,end='2015-01-01 15:00')
Edit note: The data for this problem has been truncated to only 24 observations from the 525 initially provided. As a result, the window() call has also been modified to use a time within the truncated range.
Data:
dput(SampleData[1:24,c("DateTime","MedTime")])
SampleData = structure(list(DateTime = c("2015-01-01 00:00", "2015-01-01 01:00", "2015-01-01 02:00", "2015-01-01 03:00", "2015-01-01 04:00", "2015-01-01 05:00", "2015-01-01 06:00", "2015-01-01 07:00", "2015-01-01 08:00", "2015-01-01 09:00", "2015-01-01 10:00", "2015-01-01 11:00", "2015-01-01 12:00", "2015-01-01 13:00", "2015-01-01 14:00", "2015-01-01 15:00", "2015-01-01 16:00", "2015-01-01 17:00", "2015-01-01 18:00", "2015-01-01 19:00", "2015-01-01 20:00", "2015-01-01 21:00", "2015-01-01 22:00", "2015-01-01 23:00"), MedTime = c(11, 14, 17, 5, 5, 5.5, 8, NA, 5.5, 6.5, 8.5, 4, 5, 9, 10, 11, 7, 6, 7, 7, 5, 6, 9, 9)), .Names = c("DateTime", "MedTime"), row.names = c(NA, 24L), class = "data.frame")
Recommended answer
Time Series in R
The ts() object has a few limitations. Most notably, it doesn't accept a time stamp per observation. Instead, it takes a start and a frequency, abbreviated freq here (the end is optional). Furthermore, freq only describes the data in terms of seasons, that is, the number of observations per seasonal period.
Type Frequency
Annual 1
Quarterly 4
Monthly 12
Weekly 52
Thus, to generate the correct "season" for hourly data we would have to use a daily seasonality where freq=24 (hours per day), or freq=168 (=24*7) for a weekly one; freq=1440 (=24*60) would only apply to minute-level data. It gets a bit more complicated after that.
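If a plain ts() object is still wanted, the time axis has to be expressed as (period, position) pairs rather than date strings. A minimal sketch, assuming a daily season (24 hourly observations per period) and counting days from 1; the names med_time, ts_data, and train are illustrative, and the values are the 24 MedTime observations from the question:

```r
# The 24 hourly MedTime values from the question's sample data
med_time <- c(11, 14, 17, 5, 5, 5.5, 8, NA, 5.5, 6.5, 8.5, 4,
              5, 9, 10, 11, 7, 6, 7, 7, 5, 6, 9, 9)

# Assumption: one "season" is a day, so frequency = 24.
# start = c(1, 1) means day 1, first hourly slot (00:00).
ts_data <- ts(med_time, start = c(1, 1), frequency = 24)

# ts objects are windowed by (period, position), not by date strings:
# c(1, 16) is day 1, 16th hourly slot, i.e. 15:00.
train <- window(ts_data, end = c(1, 16))

length(train)
```

The trade-off is that the calendar dates are lost: the series only knows "day 1, slot 16", so mapping back to real time stamps has to be done by hand.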
As a result, I would strongly suggest creating the time series as an xts or zoo object.
Next, one of the reasons for your windowing issues is that the date you are supplying is a string and not a POSIXct or POSIXlt object. The former is preferred.
A full breakdown can be found in:
Difference between as.POSIXct/as.POSIXlt and strptime for converting character vectors to POSIXct/POSIXlt
With that being said, one of the first steps is to convert your data from character form to POSIXct:
# Convert to POSIXct
SampleData$DateTime = as.POSIXct(strptime(SampleData$DateTime, format ="%Y-%m-%d %H:%M"))
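To illustrate why the conversion matters, here is a small sketch on three of the sample time stamps. The variable names are illustrative, and tz = "UTC" is an assumption added here so the result does not depend on the session's time zone or daylight-saving rules; the original call relies on the local time zone.

```r
# Three time stamps in the same character format as SampleData$DateTime
dt_chr <- c("2015-01-01 00:00", "2015-01-01 01:00", "2015-01-01 02:00")

# Assumption: the data are in UTC; adjust tz to match the real source
dt <- as.POSIXct(dt_chr, format = "%Y-%m-%d %H:%M", tz = "UTC")

# The result is a real date-time class, so time arithmetic works
class(dt)
hours_apart <- as.numeric(diff(dt), units = "hours")

# Comparisons against a POSIXct cutoff are what windowing relies on
cutoff <- as.POSIXct("2015-01-01 01:00", format = "%Y-%m-%d %H:%M", tz = "UTC")
keep <- dt <= cutoff
```

Here hours_apart holds the spacing between consecutive observations and keep flags the rows at or before the cutoff, neither of which can be computed reliably from the raw strings.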
Windowing
From there, the windowing issue becomes trivial if we create an xts() object.
# install.packages("xts")
library(xts)
# Create an xts object to hold the time series
sdts = xts(SampleData$MedTime, order.by = SampleData$DateTime)
# Subset the training data; with the 24-row sample above, this end date lies past the last observation, so every row is kept
train = window(sdts, end = as.POSIXct('2015-01-21 23:00', format = "%Y-%m-%d %H:%M"))
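Since the stated end goal is an ARIMA model, the windowed series still has to reach a fitting function. The following is only a sketch under several assumptions: it uses base R's stats::arima as a stand-in for the Arima() call mentioned in the question, the AR(1) order is arbitrary and chosen purely for illustration, frequency = 24 assumes a daily season, and the cutoff is the 15:00 time from the question's truncated sample. All variable names are illustrative.

```r
library(xts)

# The 24 hourly MedTime values from the question, indexed by POSIXct (UTC assumed)
med_time <- c(11, 14, 17, 5, 5, 5.5, 8, NA, 5.5, 6.5, 8.5, 4,
              5, 9, 10, 11, 7, 6, 7, 7, 5, 6, 9, 9)
idx <- seq(as.POSIXct("2015-01-01 00:00", format = "%Y-%m-%d %H:%M", tz = "UTC"),
           by = "hour", length.out = 24)
sdts <- xts(med_time, order.by = idx)

# Keep observations up to and including 15:00 as the training set
cutoff <- as.POSIXct("2015-01-01 15:00", format = "%Y-%m-%d %H:%M", tz = "UTC")
train <- window(sdts, end = cutoff)

# Strip the xts index and hand a regular ts to the model
# (assumption: daily season, hence frequency = 24)
train_ts <- ts(as.numeric(coredata(train)), frequency = 24)

# Illustrative AR(1) fit; method = "ML" handles the missing value exactly
fit <- arima(train_ts, order = c(1, 0, 0), method = "ML")

# Forecast the next three hours
fc <- predict(fit, n.ahead = 3)$pred
```

Keeping the data in xts for subsetting and converting to ts only at the modelling step keeps the date-time handling and the seasonal bookkeeping separate.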