dplyr: Summing n leading values
Question
I have some data like this:
data <- tibble(a = 1:100)
a
--
1
2
3
4
5
6
7
...
etc.
Is there an elegant way to create a variable that is the sum of the n leading values? I mean something like this:
data %>% mutate(b = lead(a,1) + lead(a,2) + lead(a,3) + ... + lead(a,n))
For example, with n = 2 I would get:
a b
--------------
1 2+3 = 5
2 3+4 = 7
3 4+5 = 9
4 5+6 = 11
5 6+7 = 13
6 7+8 = 15
7 8+9 = 17
...
Thanks in advance!
Recommended answer
We're getting dangerously close to recreating the stats::filter function, which dplyr masks:
stats::filter(1:10, c(rep(1,2),0), sides=1)
#Time Series:
#Start = 1
#End = 10
#Frequency = 1
# [1] NA NA 5 7 9 11 13 15 17 19
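To make the filter coefficients less mysterious, here is a small sketch that compares the call above with the same sum written by hand. The names via_filter and by_hand are made up for illustration. With sides = 1 the coefficients apply to x[i], x[i-1], x[i-2], so c(1, 1, 0) yields x[i] + x[i-1], and the first two positions are NA because the three-term filter needs two earlier values.

```r
x <- 1:10
n <- 2

# One-sided convolution filter: coefficients c(1, 1, 0) weight
# x[i], x[i-1] and x[i-2], giving x[i] + x[i-1].
via_filter <- as.numeric(stats::filter(x, c(rep(1, n), 0), sides = 1))

# The same sum by hand; positions 1 and 2 stay NA because the
# filter has length 3 and needs x[i-2] to exist.
by_hand <- rep(NA_real_, length(x))
for (i in 3:length(x)) {
  by_hand[i] <- x[i] + x[i - 1]
}

print(via_filter)
print(by_hand)
```

Both vectors should agree, which is why reversing the input, putting the zero coefficient first, and reversing the result turns this trailing sum into a leading one.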
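Here's a little function to exactly match the output:
# Sum of the n values that follow each element of x.
# Reverse x, take a one-sided sum that skips the current element
# (leading 0 coefficient), then reverse back.
sumnahead <- function(x, n) {
  rev(stats::filter(rev(x), c(0, rep(1, n)), sides = 1))
}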
sumnahead(1:10,2)
#[1] 5 7 9 11 13 15 17 19 NA NA
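As a sketch of how this might be used for the original question, the function can be called inside mutate. This assumes dplyr is installed; the column b_check is a hypothetical name added only to compare against the explicit lead() sum from the question, and as.numeric() is there just to make sure the column is a plain numeric vector.

```r
library(dplyr)

sumnahead <- function(x, n) {
  rev(stats::filter(rev(x), c(0, rep(1, n)), sides = 1))
}

data <- tibble(a = 1:100)

result <- data %>%
  mutate(
    b = as.numeric(sumnahead(a, 2)),        # sum of the next 2 values
    b_check = lead(a, 1) + lead(a, 2)       # explicit version from the question
  )

head(result)
```

The last n rows of b are NA, since there are not enough following values to sum, which is also what the explicit lead() version gives.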
It's also fast because it farms out to compiled code:
system.time(sumnahead(1:1e7,50))
# user system elapsed
# 2.28 0.22 2.53
# lead_n is not defined in this answer
system.time(lead_n(1:1e7,50))
# user system elapsed
# 6.02 4.07 10.13