Count consecutive values in groups with condition with dplyr and rle
Question
My question is very similar to the one linked below; however, I want to add a condition so that only sequences with more than 2 consecutive values are returned.
How do I count the number of consecutive "success" runs (i.e. 1 in $consec) that contain more than 2 consecutive values, within a given Era and a given Year?
Similar question: Summarize consecutive failures with dplyr and rle. For comparison, I've modified the example used in that question:
library(dplyr)
df <- data.frame(Era = c(1,1,1,1,1,1,1,1,1,1),
                 Year = c(1,2,2,3,3,3,3,3,3,3),
                 consec = c(0,0,1,0,1,1,0,1,1,1))
df %>%
  group_by(Era, Year) %>%
  do({tmp <- with(rle(.$consec == 1), lengths[values])
      data.frame(Year = .$Year, Count = length(tmp))}) %>%
  slice(1L)
> Source: local data frame [3 x 3]
> Groups: Era, Year
> Era Year Count
> 1 1 1 0
> 2 1 2 1
> 3 1 3 2
>
All I need now is to add a condition so that only runs of more than 2 consecutive values are counted. Desired result:
> Source: local data frame [3 x 3]
> Groups: Era, Year
> Era Year Count
> 1 1 1 0
> 2 1 2 0
> 3 1 3 1
Any suggestions would be greatly appreciated.
Recommended answer
We need to create a logical index from the run lengths returned by rle and take the sum of it, which gives the number of runs that meet the condition.
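To see what that logical index looks like, here is a small base R sketch on the consec values of one group (Era 1, Year 3 from the example data). The variable names x, r and long_runs are only illustrative and are not part of the original answer.

x <- c(0, 1, 1, 0, 1, 1, 1)   # consec values for Era 1, Year 3

r <- rle(x)                   # run-length encoding of x
r$lengths                     # 1 2 1 3  (length of each run)
r$values                      # 0 1 0 1  (value repeated in each run)

long_runs <- r$lengths > 2    # FALSE FALSE FALSE TRUE
sum(long_runs)                # 1: one run is longer than 2

Summing a logical vector counts its TRUE entries, so the sum is the number of runs longer than 2.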
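df %>%
  group_by(Era, Year) %>%
  # count runs of 1s ("successes") that are longer than 2;
  # values == 1 keeps long runs of 0s from being counted
  do({ tmp <- with(rle(.$consec), sum(lengths > 2 & values == 1))
       data.frame(Count = tmp)})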
# Era Year Count
# <dbl> <dbl> <int>
#1 1 1 0
#2 1 2 0
#3 1 3 1
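As a possible alternative, the same count can be sketched with summarise instead of do. This assumes dplyr 1.0.0 or later (for the .groups argument). count_long_runs is a hypothetical helper introduced here for illustration, not something from the original answer or from dplyr.

library(dplyr)

df <- data.frame(
  Era    = c(1, 1, 1, 1, 1, 1, 1, 1, 1, 1),
  Year   = c(1, 2, 2, 3, 3, 3, 3, 3, 3, 3),
  consec = c(0, 0, 1, 0, 1, 1, 0, 1, 1, 1)
)

# Hypothetical helper: number of runs of `target` longer than `min_len`
count_long_runs <- function(x, target = 1, min_len = 2) {
  r <- rle(x)
  sum(r$values == target & r$lengths > min_len)
}

result <- df %>%
  group_by(Era, Year) %>%
  summarise(Count = count_long_runs(consec), .groups = "drop")

result

Keeping the run logic in a small function makes the threshold and the target value easy to change, and it can be checked on a plain vector before it is applied per group.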