Cumulative sum that resets when 0 is encountered

Problem description

I would like to compute a cumulative sum on a field, but reset the running total whenever a 0 is encountered.

Here is an example of what I want:

# Assigned to df so that the answers below, which refer to df, can be run as-is
df <- data.frame(campaign = letters[1:4],
                 date = c("jan", "feb", "march", "april"),
                 b = c(1, 0, 1, 1),
                 whatiwant = c(1, 0, 1, 2))

  campaign  date b whatiwant
1        a   jan 1         1
2        b   feb 0         0
3        c march 1         1
4        d april 1         2

Recommended answer

Another base R option would be simply:

with(df, ave(b, cumsum(b == 0), FUN = cumsum))
## [1] 1 0 1 2

This simply splits column b into groups at each occurrence of 0 and computes the cumulative sum of b within each group.
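To make the grouping visible, here is a small sketch that spells out the intermediate step. The input vector and the names `grp` and `res` are made up for illustration; the input is a little longer than the question's so that consecutive zeros appear as well.

```r
# Illustrative input (not from the question): includes two consecutive zeros
b <- c(1, 0, 1, 1, 0, 0, 2, 3)

# Every 0 increments the counter, so each 0 opens a new group
grp <- cumsum(b == 0)
grp
## [1] 0 1 1 1 2 3 3 3

# Cumulative sum of b within each group; a group begins at its 0,
# so the running total restarts there
res <- ave(b, grp, FUN = cumsum)
res
## [1] 1 0 1 2 0 0 2 5
```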

Another solution, using a recent data.table version (v1.9.6+):

library(data.table) ## v 1.9.6+
setDT(df)[, whatiwant := cumsum(b), by = rleid(b == 0L)]
#    campaign  date b whatiwant
# 1:        a   jan 1         1
# 2:        b   feb 0         0
# 3:        c march 1         1
# 4:        d april 1         2
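Here `rleid` gives one id per run of consecutive equal values, so the groups alternate between runs of zeros and runs of non-zeros. As a rough illustration only, the sketch below imitates that run numbering in base R with `rle`. The helper `run_id` is hypothetical and is not data.table's implementation; the input vector is also made up.

```r
# Hypothetical helper: number consecutive runs of equal values,
# imitating what rleid() returns for a single vector
run_id <- function(v) {
  r <- rle(v)
  rep(seq_along(r$lengths), r$lengths)
}

b <- c(1, 0, 1, 1, 0, 0, 2, 3)   # illustrative input

# Runs of "is zero" / "is not zero" each get their own id
grp <- run_id(b == 0)
grp
## [1] 1 2 3 3 4 4 5 5

# Cumulative sum within each run
res <- ave(b, grp, FUN = cumsum)
res
## [1] 1 0 1 2 0 0 2 5
```

Unlike grouping by `cumsum(b == 0)`, the zeros end up in groups of their own here, but since a run of zeros sums to 0 the result is the same.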

Some benchmarks, as requested in the comments:

set.seed(123)
x <- sample(0:1e3, 1e7, replace = TRUE)
system.time(res1 <- ave(x, cumsum(x == 0), FUN = cumsum))
# user  system elapsed 
# 1.54    0.24    1.81 
system.time(res2 <- Reduce(function(x, y) if (y == 0) 0 else x+y, x, accumulate=TRUE))
# user  system elapsed 
# 33.94    0.39   34.85 
library(data.table)
system.time(res3 <- data.table(x)[, whatiwant := cumsum(x), by = rleid(x == 0L)])
# user  system elapsed 
# 0.20    0.00    0.21 

identical(res1, as.integer(res2))
## [1] TRUE
identical(res1, res3$whatiwant)
## [1] TRUE
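The `Reduce` call timed above carries a running total from element to element and sets it back to 0 whenever the current value is 0. A small sketch on the question's data, with the names `step` and `res` chosen here for illustration:

```r
b <- c(1, 0, 1, 1)   # column b from the question

# acc is the running total so far, y is the current element
step <- function(acc, y) if (y == 0) 0 else acc + y

# accumulate = TRUE keeps every intermediate total, not just the last one
res <- Reduce(step, b, accumulate = TRUE)
res
## [1] 1 0 1 2
```

This calls an R function once per element, which is consistent with it being far slower than the vectorised alternatives in the timings above.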
