dplyr: Summing n leading values


Question

I have some data like this:

data <- tibble(a = 1:100)

a
--
1
2
3
4
5
6
7
...

and so on.

Is there any elegant way to create a variable that would be a sum of n leading values? I mean something like this:

data %>% mutate(b = lead(a,1) + lead(a,2) + lead(a,3) + ... + lead(a,n))

For example, in the case of n = 2 I would get:

a      b
--------------
1    2+3 = 5
2    3+4 = 7
3    4+5 = 9
4    5+6 = 11
5    6+7 = 13
6    7+8 = 15
7    8+9 = 17
...

Thanks in advance!

Recommended answer

We're getting dangerously close to recreating the stats::filter function which dplyr masks:

stats::filter(1:10, c(rep(1,2),0), sides=1)
#Time Series:
#Start = 1 
#End = 10 
#Frequency = 1 
# [1] NA NA  5  7  9 11 13 15 17 19
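
To see where those numbers come from: with sides=1 the filter looks backwards only, so the coefficients c(1, 1, 0) give 1*x[i] + 1*x[i-1] + 0*x[i-2] at position i, and the first two positions are NA because the three-term window is incomplete there. A minimal base R check of that reading, where `f`, `filtered` and `manual` are names introduced here for illustration:

```r
x <- 1:10
f <- c(1, 1, 0)

# One-sided convolution filter from the stats package
filtered <- as.numeric(stats::filter(x, f, sides = 1))

# Same computation written out by hand:
# f[1]*x[i] + f[2]*x[i-1] + f[3]*x[i-2]
manual <- rep(NA_real_, length(x))
for (i in seq_along(x)) {
  if (i >= length(f)) {
    manual[i] <- sum(f * x[i:(i - length(f) + 1)])
  }
}

all.equal(filtered, manual)
```

The trailing 0 in the filter only pads the window by one position; it does not change the sums, just where the NA values fall.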

Here's a little function to exactly match the output:

sumnahead <- function(x,n) {
  rev(stats::filter(rev(x), c(0,rep(1,n)), sides=1))
}

sumnahead(1:10,2)
#[1]  5  7  9 11 13 15 17 19 NA NA
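
Since the question asks for a new column, here is a sketch of how the helper could be dropped into mutate(), assuming dplyr is attached. The explicit stats:: prefix inside sumnahead matters because dplyr masks filter. The name `result` is introduced here for illustration.

```r
library(dplyr)

# Helper from the answer: sum of the n values that follow each element
sumnahead <- function(x, n) {
  rev(stats::filter(rev(x), c(0, rep(1, n)), sides = 1))
}

data <- tibble(a = 1:100)

# b holds a[i+1] + a[i+2]; the last n rows are NA because
# fewer than n values follow them
result <- data %>% mutate(b = sumnahead(a, 2))

head(result, 3)
tail(result, 3)
```

The first rows of b should match the table in the question (5, 7, 9, ...), and the final two rows are NA.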

It's also fast because it farms out to compiled code:

system.time(sumnahead(1:1e7,50))
#   user  system elapsed 
#   2.28    0.22    2.53 
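# lead_n() is not defined in this answer; it appears to be a
# lead()-based alternative from elsewhere in the thread
system.time(lead_n(1:1e7,50))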
#   user  system elapsed 
#   6.02    4.07   10.13 
