Rolling window over irregular time series

Question

I have an irregular time series of events (posts) using xts, and I want to calculate the number of events that occur over a rolling weekly window (or biweekly, or 3 day, etc). The data looks like this:

                    postid
2010-08-04 22:28:07    867
2010-08-04 23:31:12    891
2010-08-04 23:58:05    901
2010-08-05 08:35:50    991
2010-08-05 13:28:02   1085
2010-08-05 14:14:47   1114
2010-08-05 14:21:46   1117
2010-08-05 15:46:24   1151
2010-08-05 16:25:29   1174
2010-08-05 23:19:29   1268
2010-08-06 12:15:42   1384
2010-08-06 15:22:06   1403
2010-08-07 10:25:49   1550
2010-08-07 18:58:16   1596
2010-08-07 21:15:44   1608

should yield something like

                    nposts
2010-08-05 00:00:00     10
2010-08-06 00:00:00      9
2010-08-07 00:00:00      5

for a 2-day window. I have looked into rollapply, apply.rolling from PerformanceAnalytics, etc., and they all assume regular time series data. I tried changing all of the times to just the day the post occurred and using something like ddply to group on each day, which gets me close. However, a user might not post every day, so the time series will still be irregular. I could fill in the gaps with 0s, but that might inflate my data a lot, and it's already quite large.

What should I do?

Recommended answer

Here's a solution using xts:

library(xts)  # also attaches zoo, which provides rollapply

x <- structure(c(867L, 891L, 901L, 991L, 1085L, 1114L, 1117L, 1151L, 
  1174L, 1268L, 1384L, 1403L, 1550L, 1596L, 1608L), .Dim = c(15L, 1L),
  index = structure(c(1280960887, 1280964672, 1280966285, 
  1280997350, 1281014882, 1281017687, 1281018106, 1281023184, 1281025529, 
  1281050369, 1281096942, 1281108126, 1281176749, 1281207496, 1281215744),
  tzone = "", tclass = c("POSIXct", "POSIXt")), class = c("xts", "zoo"),
  .indexCLASS = c("POSIXct", "POSIXt"), tclass = c("POSIXct", "POSIXt"),
  .indexTZ = "", tzone = "")
# first count the number of observations each day
xd <- apply.daily(x, length)
# now sum the counts over a 2-day rolling window
x2d <- rollapply(xd, 2, sum)
# align times at the end of the period (if you want)
y <- align.time(x2d, n=60*60*24)  # n is in seconds
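
One caveat with the approach above: apply.daily only returns rows for days that contain posts, so a width of 2 in rollapply means "the last two days with data", which can span more than two calendar days when a user skips a day. If a strictly time-based window matters, the following is a minimal base R sketch (not part of the original answer) that counts events in the interval (end - width, end] without zero-filling. The helper name rolling_count, the choice of UTC, and the choice of daily midnights as window ends are assumptions made for illustration.

```r
# Event timestamps from the sample data, as seconds since the epoch (UTC).
times <- as.POSIXct(
  c(1280960887, 1280964672, 1280966285, 1280997350, 1281014882,
    1281017687, 1281018106, 1281023184, 1281025529, 1281050369,
    1281096942, 1281108126, 1281176749, 1281207496, 1281215744),
  origin = "1970-01-01", tz = "UTC")

# Hypothetical helper: for each window end, count the events that fall in
# (end - width_secs, end]. findInterval(e, t) gives the number of sorted
# timestamps <= e, so the difference of two calls is the count in the window.
rolling_count <- function(times, ends, width_secs) {
  t <- sort(as.numeric(times))
  e <- as.numeric(ends)
  findInterval(e, t) - findInterval(e - width_secs, t)
}

# Window ends at midnight UTC following each of 2010-08-05, -06 and -07.
ends <- seq(as.POSIXct("2010-08-06", tz = "UTC"),
            as.POSIXct("2010-08-08", tz = "UTC"), by = "day")

nposts <- rolling_count(times, ends, width_secs = 2 * 24 * 60 * 60)
print(data.frame(end = ends, nposts = nposts))
# nposts is 10, 9, 5
```

Because the window is defined in seconds rather than in rows, a day with no posts simply contributes nothing to the count, and changing the window to 3 days or a week only means changing width_secs. Note that each window here is labeled by its end (midnight after the last included day), which matches what align.time produces rather than the day-start labels shown in the question.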
