Rolling window over irregular time series
Problem description
I have an irregular time series of events (posts) using xts, and I want to calculate the number of events that occur over a rolling weekly window (or biweekly, or 3-day, etc.). The data looks like this:
                    postid
2010-08-04 22:28:07 867
2010-08-04 23:31:12 891
2010-08-04 23:58:05 901
2010-08-05 08:35:50 991
2010-08-05 13:28:02 1085
2010-08-05 14:14:47 1114
2010-08-05 14:21:46 1117
2010-08-05 15:46:24 1151
2010-08-05 16:25:29 1174
2010-08-05 23:19:29 1268
2010-08-06 12:15:42 1384
2010-08-06 15:22:06 1403
2010-08-07 10:25:49 1550
2010-08-07 18:58:16 1596
2010-08-07 21:15:44 1608
and should produce something like
                    nposts
2010-08-05 00:00:00 10
2010-08-06 00:00:00 9
2010-08-07 00:00:00 5
for a 2-day window. I have looked into rollapply, apply.rolling from PerformanceAnalytics, etc., and they all assume regular time series data. I tried changing all of the times to just the day the post occurred and using something like ddply to group on each day, which gets me close. However, a user might not post every day, so the time series will still be irregular. I could fill in the gaps with 0s, but that might inflate my data a lot, and it's already quite large.
What should I do?
Recommended answer
Here's a solution using xts:
library(xts)

# example data from the question, rebuilt as an xts object
x <- structure(c(867L, 891L, 901L, 991L, 1085L, 1114L, 1117L, 1151L,
1174L, 1268L, 1384L, 1403L, 1550L, 1596L, 1608L), .Dim = c(15L, 1L),
index = structure(c(1280960887, 1280964672, 1280966285,
1280997350, 1281014882, 1281017687, 1281018106, 1281023184, 1281025529,
1281050369, 1281096942, 1281108126, 1281176749, 1281207496, 1281215744),
tzone = "", tclass = c("POSIXct", "POSIXt")), class = c("xts", "zoo"),
.indexCLASS = c("POSIXct", "POSIXt"), tclass = c("POSIXct", "POSIXt"),
.indexTZ = "", tzone = "")
# first count the number of observations each day
xd <- apply.daily(x, length)
# now sum the counts over a 2-day rolling window
x2d <- rollapply(xd, 2, sum)
# align times at the end of the period (if you want)
y <- align.time(x2d, n=60*60*24) # n is in seconds
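
The question also worries about days with no posts at all. As a complement to the xts answer, here is a minimal base-R sketch that counts events directly on the irregular timestamps, so no zero-filled daily series is needed. The helper name count_in_window is hypothetical (it is not part of xts or zoo), and the sketch assumes the timestamps are interpreted in UTC and that each window is the half-open interval (end - width, end].

```r
# Post timestamps in seconds since the epoch (copied from the xts index above)
times <- c(1280960887, 1280964672, 1280966285, 1280997350, 1281014882,
           1281017687, 1281018106, 1281023184, 1281025529, 1281050369,
           1281096942, 1281108126, 1281176749, 1281207496, 1281215744)

# Hypothetical helper: number of events in (end - width, end] for each end.
# findInterval(x, v) returns how many elements of the sorted vector v are <= x,
# so the difference of two calls is the count inside the window.
count_in_window <- function(times, ends, width) {
  times <- sort(times)
  findInterval(ends, times) - findInterval(ends - width, times)
}

# Window ends: midnight (UTC, an assumption) closing Aug 5, Aug 6 and Aug 7
ends <- as.numeric(as.POSIXct(c("2010-08-06", "2010-08-07", "2010-08-08"),
                              tz = "UTC"))
width <- 2 * 24 * 60 * 60  # 2-day window, in seconds

nposts <- count_in_window(times, ends, width)
print(nposts)  # 10 9 5
```

Because the count is computed from the raw timestamps, a window that contains no posts simply yields 0, and changing width gives weekly, biweekly, or 3-day windows without reshaping the data.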