Summing rows after first occurrence of a certain number
Problem description
I would like to get the sum of the rows after the first occurrence of a certain number within each group. In this case the number is 10, for instance.
I thought that if we knew the row number just after the first occurrence and the last row number of that group, we could sum the rows in between.
I can get the first occurrence of 10 in each group, but I don't know how to get the sum of the rows.
df <- data.frame(gr=rep(c(1,2),c(7,9)),
y_value=c(c(0,0,10,8,8,6,0),c(0,0,10,10,5,4,2,0,0)))
> df
gr y_value
1 1 0
2 1 0
3 1 10
4 1 8
5 1 8
6 1 6
7 1 0
8 2 0
9 2 0
10 2 10
11 2 10
12 2 5
13 2 4
14 2 2
15 2 0
16 2 0
My initial attempt is below; for some reason it does not work, not even the grouping part:
library(dplyr)
df%>%
group_by(gr)%>%
mutate(check1=any(y_value==10),row_sum=which(y_value == 10)[1])
Expected output
> df
gr y_value sum_rows_range
1 1 0 22/4
2 1 0 22/4
3 1 10 22/4
4 1 8 22/4
5 1 8 22/4
6 1 6 22/4
7 1 0 22/4
8 2 0 21/6
9 2 0 21/6
10 2 10 21/6
11 2 10 21/6
12 2 5 21/6
13 2 4 21/6
14 2 2 21/6
15 2 0 21/6
16 2 0 21/6
Recommended answer
A dplyr solution:
library(dplyr)
df %>%
group_by(gr) %>%
slice(if(any(y_value == 10)) (which.max(y_value == 10)+1):n() else row_number()) %>%
summarize(sum = sum(y_value),
rows = n()) %>%
inner_join(df)
Notes:
The main idea is to slice on the rows after the first 10 occurs. The if (any(y_value == 10)) check and else row_number() are just there to take care of the case where there are no 10's in y_value.
Reading the documentation for ?which.max, you will notice that when it is applied to a logical vector, in this case y_value == 10, "with both FALSE and TRUE values, which.min(x) and which.max(x) return the index of the first FALSE or TRUE, respectively, as FALSE < TRUE."
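The quoted behaviour can be checked on a plain vector. This is a small sketch in base R using the second group's values from the question; the variable names (is_ten, first_ten, first_false, no_ten) are made up for illustration.

```r
# y_value for group 2 of the example data
y_value <- c(0, 0, 10, 10, 5, 4, 2, 0, 0)
is_ten <- y_value == 10          # logical vector: FALSE FALSE TRUE TRUE FALSE ...

first_ten   <- which.max(is_ten) # index of the first TRUE
first_false <- which.min(is_ten) # index of the first FALSE

# Caveat: with no TRUE at all, which.max() still returns 1,
# which is why the answer guards the slice with any(y_value == 10).
no_ten <- which.max(c(0, 5, 8) == 10)

print(c(first_ten, first_false, no_ten))
```

Here first_ten is 3 (the first 10 sits in position 3), while no_ten is 1 even though the vector contains no 10.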
In other words, which.max(y_value == 10) will give the index of the first occurrence of 10. By adding 1 to it, I can start slicing from the value right after the first occurrence of 10.
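To make the "+ 1" step concrete, here is a minimal base R sketch of what the slice keeps for the first group. It only mirrors the idea of the answer on a single vector; start and after_ten are illustrative names, not part of the original code.

```r
# y_value for group 1 of the example data
y_value <- c(0, 0, 10, 8, 8, 6, 0)

# Position right after the first 10
start <- which.max(y_value == 10) + 1

# Everything from that position to the end of the group
after_ten <- y_value[start:length(y_value)]

group_sum  <- sum(after_ten)     # 8 + 8 + 6 + 0
group_rows <- length(after_ten)  # number of rows after the first 10
print(c(group_sum, group_rows))
```

This gives 22 and 4, matching the 22/4 in the expected output. One thing to keep in mind: if the first 10 is the last value of a group, start exceeds the group length and start:length(y_value) counts backwards, so that case would likely need its own guard.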
Result:
# A tibble: 16 × 4
gr sum rows y_value
<dbl> <dbl> <int> <dbl>
1 1 22 4 0
2 1 22 4 0
3 1 22 4 10
4 1 22 4 8
5 1 22 4 8
6 1 22 4 6
7 1 22 4 0
8 2 21 6 0
9 2 21 6 0
10 2 21 6 10
11 2 21 6 10
12 2 21 6 5
13 2 21 6 4
14 2 21 6 2
15 2 21 6 0
16 2 21 6 0
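If the single sum_rows_range column from the expected output is wanted (for example "22/4"), one possible variation is to flag the rows after the first 10 and build the string in a grouped mutate(). This is a sketch of an alternative, not the method used in the answer; the helper column after_first_10 is a hypothetical name, and unlike the answer's fallback, a group without any 10 would get "0/0" here.

```r
library(dplyr)

df <- data.frame(gr = rep(c(1, 2), c(7, 9)),
                 y_value = c(0, 0, 10, 8, 8, 6, 0,
                             0, 0, 10, 10, 5, 4, 2, 0, 0))

res <- df %>%
  group_by(gr) %>%
  mutate(
    # TRUE for every row strictly after the first 10 in the group
    after_first_10 = lag(cumsum(y_value == 10) > 0, default = FALSE),
    # "<sum of those rows>/<number of those rows>", recycled over the group
    sum_rows_range = paste0(sum(y_value[after_first_10]), "/",
                            sum(after_first_10))
  ) %>%
  ungroup() %>%
  select(-after_first_10)

print(res)
```

Because mutate() keeps all rows, no join back to df is needed in this variant.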