Running Sum in R data.table


Problem description

I have a data.table in R and would like to apply a rolling sum to it by group. The problem is that the groups differ in length, and when the rollapply function reaches a shorter group, it throws an error. Is there a way to solve this without for-loops?

The following is a simple example to illustrate the problem.

library(data.table)

DT <- data.table(id = c(rep("A", 6), rep("B", 2), rep("C", 8)),
                 val = c(1:6, 1:2, 1:8))
> DT
    id val
 1:  A   1
 2:  A   2
 3:  A   3
 4:  A   4
 5:  A   5
 6:  A   6
 7:  B   1
 8:  B   2
 9:  C   1
10:  C   2
11:  C   3
12:  C   4
13:  C   5
14:  C   6
15:  C   7
16:  C   8

Using rollapplyr() from the zoo package:

library(zoo)
DT[, cum.sum := rollapplyr(val, width = 4, FUN = sum, fill = NA), by = id]

But this gives me an error:

Error in seq.default(start.at, NROW(data), by = by) : wrong sign in 'by' argument



And the output is:

> DT
    id val cum.sum
 1:  A   1      NA
 2:  A   2      NA
 3:  A   3      NA
 4:  A   4      10
 5:  A   5      14
 6:  A   6      18
 7:  B   1      NA
 8:  B   2      NA
 9:  C   1      NA
10:  C   2      NA
11:  C   3      NA
12:  C   4      NA
13:  C   5      NA
14:  C   6      NA
15:  C   7      NA
16:  C   8      NA

Ideally, the output should be:

> DT
    id val cum.sum
 1:  A   1      NA
 2:  A   2      NA
 3:  A   3      NA
 4:  A   4      10
 5:  A   5      14
 6:  A   6      18
 7:  B   1      NA
 8:  B   2      NA
 9:  C   1      NA
10:  C   2      NA
11:  C   3      NA
12:  C   4      10
13:  C   5      14
14:  C   6      18
15:  C   7      22
16:  C   8      26


Recommended answer

We can do:

DT[, cum.sum := Reduce(`+`, shift(val, 0:3)), by=id]

    id val cum.sum
 1:  A   1      NA
 2:  A   2      NA
 3:  A   3      NA
 4:  A   4      10
 5:  A   5      14
 6:  A   6      18
 7:  B   1      NA
 8:  B   2      NA
 9:  C   1      NA
10:  C   2      NA
11:  C   3      NA
12:  C   4      10
13:  C   5      14
14:  C   6      18
15:  C   7      22
16:  C   8      26
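
To make it clearer why this works even for the two-row group B, the one-liner can be unpacked as in the sketch below. The variable names `k` and `lags` are introduced here only for illustration and are not part of the original answer; the data is the toy table from the question.

```r
library(data.table)

# Same toy data as in the question.
DT <- data.table(id  = c(rep("A", 6), rep("B", 2), rep("C", 8)),
                 val = c(1:6, 1:2, 1:8))

k <- 4  # window width (illustrative name, not from the original answer)

# shift(x, 0:(k - 1)) returns a list of k vectors: x lagged by 0, 1, ..., k - 1.
# Positions that have no earlier value are filled with NA.
lags <- shift(1:6, 0:(k - 1))
# lags[[1]] is 1 2 3 4 5 6; lags[[4]] is NA NA NA 1 2 3

# Reduce(`+`, ...) adds the lagged vectors element-wise. Any NA makes the
# sum NA, so the first k - 1 rows of each group are NA, and a group with
# fewer than k rows is NA throughout instead of raising an error.
DT[, cum.sum := Reduce(`+`, shift(val, 0:(k - 1))), by = id]
print(DT)
```

Because a lag longer than the group simply yields a vector of NA, no special handling of short groups is needed.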






I know I've seen this somewhere before; possible duplicate?
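
As an alternative that is not part of the original answer, data.table itself provides a rolling-sum function, `frollsum()`. The sketch below assumes an installed data.table release that includes `frollsum()`, and relies on its default right alignment and `fill = NA`. Note that it returns a numeric (double) column rather than an integer one.

```r
library(data.table)

# Same toy data as in the question.
DT <- data.table(id  = c(rep("A", 6), rep("B", 2), rep("C", 8)),
                 val = c(1:6, 1:2, 1:8))

# frollsum() computes a right-aligned rolling sum by default and fills
# incomplete windows with NA. For a group shorter than the window
# (group B here), every row is filled with NA.
DT[, cum.sum := frollsum(val, 4), by = id]
print(DT)
```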
