在data.table R中按组滚动 [英] Rolling by group in data.table R

查看:81
本文介绍了在data.table R中按组滚动的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图按组逐个遍历data.table并遇到问题。不知道应该更改功能还是打错电话。这是简单的示例:

I'm trying to roll my function through data.table by group and run into problems. Not sure should I change my function or is my call wrong. Here is simple example:

数据

 test <- data.table(return=c(0.1, 0.1, 0.1, 0.1, 0.1, 0.2, 0.2, 0.2, 0.2, 0.2),
                   sec=c("A", "A", "A", "A", "A", "B", "B", "B", "B", "B"))

我的函数

zoo_fun <- function(dt, N) {
  (rollapply(dt$return + 1, N, FUN=prod, fill=NA, align='right') - 1)
}

运行它(我想创建新的列动量,它只是每个证券为一个添加的最新3个观察值的乘积(因此,按= sec分组)。

Running it (I want to create new column momentum, which would be just product of latest 3 observations added by one for each security (so grouping by=sec).

test[, momentum3 := zoo_fun(test, 3), by=sec]

    Warning messages:
    1: In `[.data.table`(test, , `:=`(momentum3, zoo_fun(test, 3)), by = sec) :
      RHS 1 is length 10 (greater than the size (5) of group 1). The last 5 element(s) will be discarded.
    2: In `[.data.table`(test, , `:=`(momentum3, zoo_fun(test, 3)), by = sec) :
      RHS 1 is length 10 (greater than the size (5) of group 2). The last 5 element(s) will be discarded.

我收到警告,但预期结果不正确:

I get that warning and result is not expected:

> test
    return sec momentum3
 1:    0.1   A        NA
 2:    0.1   A        NA
 3:    0.1   A     0.331
 4:    0.1   A     0.331
 5:    0.1   A     0.331
 6:    0.2   B        NA
 7:    0.2   B        NA
 8:    0.2   B     0.331
 9:    0.2   B     0.331
10:    0.2   B     0.331

我期望B秒用0.728((1.2 * 1.2 * 1.2)-1)填充,其中两个NA开始。我究竟做错了什么?

I was expecting B sec to be filled with 0.728 ((1.2*1.2*1.2) -1) with two NAs in start. What am I doing wrong? Is it that rolling functions won't work with grouping?

推荐答案

使用 dt $ return 整个 data.table 在组内部进行选择。只需在函数定义中使用所需的列即可,它会正常工作:

When you use dt$return the whole data.table is picked internally within the groups. Just use the column you need in the function definition and it will work fine:

#use the column instead of the data.table
zoo_fun <- function(column, N) {
  (rollapply(column + 1, N, FUN=prod, fill=NA, align='right') - 1)
}

#now it works fine
test[, momentum := zoo_fun(return, 3), by = sec]

作为单独的注释,您可能不应该使用 return 作为列或变量名。

As a separate note, you should probably not use return as a column or variable name.

出局:

> test
    return sec momentum
 1:    0.1   A       NA
 2:    0.1   A       NA
 3:    0.1   A    0.331
 4:    0.1   A    0.331
 5:    0.1   A    0.331
 6:    0.2   B       NA
 7:    0.2   B       NA
 8:    0.2   B    0.728
 9:    0.2   B    0.728
10:    0.2   B    0.728

这篇关于在data.table R中按组滚动的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆