R data.table sliding window
Question
What is the best (fastest) way to implement a sliding window function with the data.table package?
I'm trying to calculate a rolling median, but I have multiple rows per date (due to 2 additional factors), which I think means that the zoo rollapply function wouldn't work. Here is an example using a naive for loop:
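library(data.table)

# Sample data: 30 dates x 1000 rows per date (30000 rows in total).
# Columns shorter than 30000 are recycled by data.frame().
df <- data.frame(
  id=30000,
  date=rep(as.IDate(as.IDate("2012-01-01")+0:29, origin="1970-01-01"), each=1000),
  factor1=rep(1:5, each=200),
  factor2=1:5,
  value=rnorm(30, 100, 10)
)
dt <- data.table(df)
setkeyv(dt, c("date", "factor1", "factor2"))

# Return every value from the 7 days before `date`
# for a single factor1/factor2 combination (keyed join on dt).
get_window <- function(date, factor1, factor2) {
  criteria <- data.table(
    date=as.IDate((date - 7):(date - 1), origin="1970-01-01"),
    factor1=as.integer(factor1),
    factor2=as.integer(factor2)
  )
  return(dt[criteria][, value])
}

# One output row per unique (date, factor1, factor2) combination.
output <- data.table(unique(dt[, list(date, factor1, factor2)]))[, window_median:=as.numeric(NA)]

# Naive loop: one join and one median per output row.
for (i in nrow(output):1) {
  print(i)
  output[i, window_median:=median(get_window(date, factor1, factor2))]
}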
Recommended answer
data.table doesn't currently have any special features for rolling windows. There is further detail in my answer to another similar question:
Is there a fast way to run a rolling regression inside data.table?
Rolling median is interesting. It would need a specialized function to do it efficiently (same link as in the earlier comment):
The data.table solutions in the question and answers here are all very inefficient relative to a proper specialized rollingmedian function (which isn't available for R, as far as I know).
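For illustration only, the per-row loop in the question can be written as a single grouped join. This is a minimal sketch, not the specialized rolling-median function described above: it still recomputes each window's median from scratch, so it only removes the loop overhead. It assumes a data.table version that supports non-equi joins in `on=`. The smaller sample data set and the names `windows`, `start`, `end`, `day` and `obs` are made up for this example.

```r
library(data.table)

set.seed(42)

# Hypothetical sample data: 30 days x 5 x 5 factor levels x 4 observations per cell
dt <- CJ(day = 0:29, factor1 = 1:5, factor2 = 1:5, obs = 1:4)
dt[, date := as.IDate("2012-01-01") + day]
dt[, value := rnorm(.N, 100, 10)]

# One row per (date, factor1, factor2), with the bounds of the preceding 7-day window
windows <- unique(dt[, .(date, factor1, factor2)])
windows[, `:=`(start = date - 7L, end = date - 1L)]

# Non-equi join: for each row of `windows`, collect the matching rows of `dt`
# and take their median. by = .EACHI yields one result per row of `windows`,
# in the same order, so the result can be assigned straight back.
windows[, window_median := dt[windows,
                              on = .(factor1, factor2, date >= start, date <= end),
                              median(value),
                              by = .EACHI]$V1]
```

Windows with no earlier data (the first date in the sample) come back as NA, as they do in the loop version.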