R data.table sliding window
Question
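What is the best (fastest) way to implement a sliding window function with the data.table package?
I'm trying to calculate a rolling median, but I have multiple rows per date (due to 2 additional factors), which I think means that the zoo rollapply function wouldn't work. Here is an example using a naive for loop: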
library(data.table)

df <- data.frame(
  id = 30000,
  date = rep(as.IDate(as.IDate("2012-01-01") + 0:29, origin = "1970-01-01"), each = 1000),
  factor1 = rep(1:5, each = 200),
  factor2 = 1:5,
  value = rnorm(30, 100, 10)
)

dt <- data.table(df)
setkeyv(dt, c("date", "factor1", "factor2"))

# Return the values from the 7 days before `date` for one factor1/factor2 combination
get_window <- function(date, factor1, factor2) {
  criteria <- data.table(
    date = as.IDate((date - 7):(date - 1), origin = "1970-01-01"),
    factor1 = as.integer(factor1),
    factor2 = as.integer(factor2)
  )
  return(dt[criteria][, value])
}

# One row per date/factor1/factor2 combination, filled in row by row
output <- data.table(unique(dt[, list(date, factor1, factor2)]))[, window_median := as.numeric(NA)]

for (i in nrow(output):1) {
  print(i)
  output[i, window_median := median(get_window(date, factor1, factor2))]
}
Answer
data.table doesn't currently have any special features for rolling windows. There is further detail in my answer to another, similar question: "Is there a fast way to run a rolling regression inside data.table?"
Rolling median is interesting. It would need a specialized function to do efficiently (same link as in earlier comment).
The data.table solutions in the question and answers here are all very inefficient relative to a proper specialized rollingmedian function (which isn't available for R, afaik).
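As a rough illustration only (this is not part of the original answer), the per-row loop from the question could be replaced by a single non-equi join. The sketch below assumes a data.table version that supports non-equi joins in `on=` (1.9.8 or later). The toy data, the `obs` column, and the names `windows`, `start`, `end`, and `med` are hypothetical and chosen for the example. It is still not a specialized rolling-median algorithm: each window's median is computed from scratch.

```r
library(data.table)

# Hypothetical toy data: 4 observations per date/factor1/factor2 combination
set.seed(1)
dt <- CJ(day = 0:29, factor1 = 1:5, factor2 = 1:5, obs = 1:4)
dt[, date := as.IDate("2012-01-01") + day]
dt[, value := rnorm(.N, 100, 10)]

# One row per combination, with the bounds of the trailing window:
# the 7 days before `date`, excluding `date` itself (as in the question)
windows <- unique(dt[, .(date, factor1, factor2)])
windows[, `:=`(start = date - 7L, end = date - 1L)]

# Non-equi join: for every row of `windows`, match the rows of `dt` with the
# same factors and a date inside [start, end], then take the median per row.
# by = .EACHI yields one result per row of `windows`, in the same order;
# windows with no matching rows get NA.
med <- dt[windows,
          on = .(factor1, factor2, date >= start, date <= end),
          median(value),
          by = .EACHI]
windows[, window_median := med$V1]
```

The result has one `window_median` per date/factor combination, which corresponds to the `output` table built by the loop in the question. Whether this is fast enough depends on the data size, since the work still grows with the window length.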