Dynamic Grouping in R | Grouping based on condition on applied function
Question
With R's aggregate() function, how can I specify a stopping condition while grouping on a function applied to a variable?
For example, I have a data frame named df (the input data frame).
Note: each row in the input data frame represents a single ball played by a player in that match, so counting rows gives the number of balls required.
I want an output data frame that answers this: how many balls are required to score 10 runs?
Currently, I am using this R code:
group_data <- aggregate(df$score, by = list(Category = df$player, df$match), FUN = sum, na.rm = TRUE)
With this code I cannot stop the grouping where I want: it only stops once every row in the group has been aggregated. I don't want all rows to be considered.
How can I add a constraint such as "stop grouping as soon as score >= 10"? My sole purpose in adding this constraint is to count the number of rows it takes to satisfy the condition.
Thanks in advance.
Recommended answer
Here is an option using dplyr:
library(dplyr)
df1 %>%
  group_by(match, player) %>%
  # keep each ball up to and including the one where the running total first reaches 10
  filter(!lag(cumsum(score) >= 10, default = FALSE)) %>%
  summarise(score = sum(score), Count = n())
# A tibble: 2 x 4
# Groups: match [?]
# match player score Count
# <int> <int> <dbl> <int>
#1 1 30 12 2
#2 2 31 15 3
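Since the question starts from aggregate(), the same running-total idea can also be sketched in base R. This is only an illustrative alternative, not part of the original answer: the helper name balls_to_target and its target argument are hypothetical, and the sample data is rebuilt inline so the snippet runs on its own.

```r
# Hypothetical helper (not from the original answer): number of balls needed
# for the running total to reach `target`; NA if the target is never reached.
balls_to_target <- function(score, target = 10) {
  reached <- which(cumsum(score) >= target)
  if (length(reached) == 0) NA_integer_ else reached[1]
}

# Same sample data as in the answer, rebuilt here so the sketch is self-contained
df1 <- data.frame(
  match  = c(1L, 1L, 1L, 2L, 2L, 2L),
  player = c(30L, 30L, 30L, 31L, 31L, 31L),
  score  = c(6, 6, 6, 3, 6, 6)
)

# aggregate() passes each match/player group's score vector to the helper
balls <- aggregate(score ~ match + player, data = df1, FUN = balls_to_target)
names(balls)[names(balls) == "score"] <- "balls"
balls
```

For the sample data this gives 2 balls for match 1 and 3 balls for match 2, in line with the Count column above. Unlike the dplyr version, it returns only the ball count and not the runs scored over those balls.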
Data
df1 <- structure(list(match = c(1L, 1L, 1L, 2L, 2L, 2L), player = c(30L,
30L, 30L, 31L, 31L, 31L), score = c(6, 6, 6, 3, 6, 6)), .Names = c("match",
"player", "score"), row.names = c(NA, -6L), class = "data.frame")