Create new group based on cumulative sum and group
Question
I am looking to create a new group based on two conditions. I want all of the cases to be grouped together until the cumulative sum of Value reaches 10, and I want this done within each person. Using for loops and dplyr, I have managed to get each condition working separately, but not both together, and I need both applied at once. Below is what I would like the data to look like (I don't need a RunningSum_Value column, but I kept it in for clarification). Ideally I would like a dplyr solution, but I'm not picky. Thank you in advance!
ID Value RunningSum_Value Group
PersonA 1 1 1
PersonA 3 4 1
PersonA 10 14 1
PersonA 3 3 2
PersonB 11 11 3
PersonB 12 12 4
PersonC 3 3 5
PersonD 4 4 6
PersonD 9 13 6
PersonD 5 5 7
PersonD 11 16 7
PersonD 6 6 8
PersonD 1 7 8
Here is my data:
df <- read.table(text="ID Value
PersonA 1
PersonA 3
PersonA 10
PersonA 3
PersonB 11
PersonB 12
PersonC 3
PersonD 4
PersonD 9
PersonD 5
PersonD 11
PersonD 6
PersonD 1", header=TRUE,stringsAsFactors=FALSE)
Recommended answer
Define function sum0 which does a sum on its argument except that each time it gets to 10 or more it outputs 0. Define function is_start that returns TRUE for the start position of a group and FALSE otherwise. Finally apply is_start to each ID group using ave and then perform a cumsum on that to get the group numbers.
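# Running sum that resets to 0 once it reaches 10 or more
sum0 <- function(x, y) { if (x + y >= 10) 0 else x + y }
# TRUE at the first row of each group, FALSE elsewhere
is_start <- function(x) head(c(TRUE, Reduce(sum0, x, accumulate = TRUE, init = 0)[-1] == 0), -1)
# df is the data frame from the question; ave applies is_start within each ID
cumsum(ave(df$Value, df$ID, FUN = is_start))
## [1] 1 1 1 2 3 4 5 6 6 7 7 8 8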
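To see why this works, it may help to trace the intermediate values for a single person. The following is a minimal sketch in base R using PersonD's values from the question; the variable names value, running, closes and starts are only illustrative and do not appear in the original answer.

```r
# Same helper as in the answer: a running sum that resets to 0 at 10 or more
sum0 <- function(x, y) if (x + y >= 10) 0 else x + y

# Values for PersonD, copied from the question's data
value <- c(4, 9, 5, 11, 6, 1)

# Running sum with resets; [-1] drops the leading init = 0 element
running <- Reduce(sum0, value, accumulate = TRUE, init = 0)[-1]
running
## [1] 4 0 5 0 6 7

# A 0 marks the row that closes a group, so the following row starts a new one
closes <- running == 0
starts <- head(c(TRUE, closes), -1)
starts
## [1]  TRUE FALSE  TRUE FALSE  TRUE FALSE

# Group numbers within this one person
cumsum(starts)
## [1] 1 1 2 2 3 3
```

Because the first row of every person is always marked TRUE, taking cumsum over the whole column after ave yields group numbers that keep increasing across people, which is how PersonD ends up with groups 6, 6, 7, 7, 8, 8 in the full result.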
Update: Fixed.