Computing rolling mean in data.table with adaptive window lengths


Question

I am looking to compute a moving average by group in a data.table with an adaptive window so that there are no NAs at the beginning of the time series. I know how to do this with frollmean and setting adaptive = TRUE (see for instance jangorecki's response in this thread). I can get the same code to work when all groups in my data.table are of the same length but run into errors when the groups are of different size.

For example, if my data is

library(data.table)
tmp = data.table(Gp = c(rep('A',6),rep('B',4)), Val = c(1,3,4,6,2,2,8,5,7,10))

and I am doing a moving average of length 3, then the desired response is

> desired_output
    Gp  Val
 1:  A 1.00
 2:  A 2.00
 3:  A 2.67
 4:  A 4.33
 5:  A 4.00
 6:  A 3.33
 7:  B 8.00
 8:  B 6.50
 9:  B 6.67
10:  B 7.33

I tried the following:

mov_window_len = vector("list",2)
mov_window_len[[1]] = c(1,2,rep(3,4))
mov_window_len[[2]] = c(1,2,rep(3,2))
tmp[,lapply(.SD, frollmean, n = mov_window_len, align = "right", adaptive = TRUE), by = Gp]

but I get an error saying length of integer vector(s) provided as list to 'n' argument must be equal to number of observations provided in 'x'

Any help in resolving this will be much appreciated. Thanks in advance.

Recommended answer

You can use the group index .GRP to subset mov_window_len. This will give you the right lengths for each group. You only want to take frollmean of Val, so no need for lapply.

tmp[, frollmean(Val, n = mov_window_len[.GRP], align = "right", adaptive = TRUE), by = Gp]

#     Gp       V1
#  1:  A 1.000000
#  2:  A 2.000000
#  3:  A 2.666667
#  4:  A 4.333333
#  5:  A 4.000000
#  6:  A 3.333333
#  7:  B 8.000000
#  8:  B 6.500000
#  9:  B 6.666667
# 10:  B 7.333333
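
To see why subsetting by .GRP helps, it may be useful to look at what each group sees inside by. As far as I understand it, frollmean is called once per group, so x is only that group's Val, while the original call passed the whole two-element list as n; the length-4 vector then cannot match group A's 6 rows. The sketch below only inspects .GRP and .N next to the lengths of mov_window_len; the name grp_info is made up for illustration.

```r
library(data.table)

tmp <- data.table(Gp  = c(rep("A", 6), rep("B", 4)),
                  Val = c(1, 3, 4, 6, 2, 2, 8, 5, 7, 10))

mov_window_len <- list(c(1, 2, rep(3, 4)),   # windows for group A (6 rows)
                       c(1, 2, rep(3, 2)))   # windows for group B (4 rows)

# .GRP is the 1-based counter of the current group and .N its row count.
# grp_info is a hypothetical helper table, not part of the original answer.
grp_info <- tmp[, .(grp = .GRP, n_rows = .N), by = Gp]

# Each list element lines up with exactly one group's row count, which is
# why mov_window_len[.GRP] hands frollmean a vector of the right length.
grp_info[, n_windows := lengths(mov_window_len)[grp]]
print(grp_info)
```

If n_rows and n_windows disagree for any group, the same length error should be expected for that group.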

Alternatively window length can be added to input data.table (Len field below), as it corresponds to each row.

tmp[Gp=="A", Len:=mov_window_len[[1]]
    ][Gp=="B", Len:=mov_window_len[[2]]
     ][, .(Val, Len, RollVal=frollmean(Val, Len, adaptive=TRUE)), by=Gp]
#    Gp Val Len  RollVal
# 1:  A   1   1 1.000000
# 2:  A   3   2 2.000000
# 3:  A   4   3 2.666667
# 4:  A   6   3 4.333333
# 5:  A   2   3 4.000000
# 6:  A   2   3 3.333333
# 7:  B   8   1 8.000000
# 8:  B   5   2 6.500000
# 9:  B   7   3 6.666667
#10:  B  10   3 7.333333
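
With many groups, filling Len one group at a time gets tedious. A possible variation, not part of the original answer, is to derive the window lengths inside each group with pmin(seq_len(.N), k), which ramps 1, 2, ..., k and then stays at k. The name k is an assumption introduced here for the target window length.

```r
library(data.table)

tmp <- data.table(Gp  = c(rep("A", 6), rep("B", 4)),
                  Val = c(1, 3, 4, 6, 2, 2, 8, 5, 7, 10))

k <- 3L  # hypothetical name for the full window length

# Per-row window length, computed within each group: 1, 2, 3, 3, ...
tmp[, Len := pmin(seq_len(.N), k), by = Gp]

# Adaptive rolling mean using the per-row window lengths (right-aligned,
# which is the default and the alignment used with adaptive = TRUE).
tmp[, RollVal := frollmean(Val, Len, adaptive = TRUE), by = Gp]
print(tmp)
```

This avoids maintaining mov_window_len by hand, at the cost of assuming every group uses the same ramp-up rule.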
