Create counter for runs of TRUE among FALSE and NA, by group
Question
I have a bit of a nut to crack.
I have a data.frame where runs of TRUE are separated by runs of one or more FALSE or NA:
   group criterium
1      A        NA
2      A      TRUE
3      A      TRUE
4      A      TRUE
5      A     FALSE
6      A     FALSE
7      A      TRUE
8      A      TRUE
9      A     FALSE
10     A      TRUE
11     A      TRUE
12     A      TRUE
13     B        NA
14     B     FALSE
15     B      TRUE
16     B      TRUE
17     B      TRUE
18     B     FALSE
structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 1L, 1L,
1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L), .Label = c("A",
"B"), class = "factor"), criterium = c(NA, TRUE, TRUE, TRUE,
FALSE, FALSE, TRUE, TRUE, FALSE, TRUE, TRUE, TRUE, NA, FALSE,
TRUE, TRUE, TRUE, FALSE)), class = "data.frame", row.names = c(NA,
-18L))
I want to rank the runs of TRUE in column criterium in ascending order while disregarding the FALSE and NA values. The goal is to have a unique, consecutive ID for each run of TRUE, within each group.
So the result should look like this:
   group criterium goal
1      A        NA   NA
2      A      TRUE    1
3      A      TRUE    1
4      A      TRUE    1
5      A     FALSE   NA
6      A     FALSE   NA
7      A      TRUE    2
8      A      TRUE    2
9      A     FALSE   NA
10     A      TRUE    3
11     A      TRUE    3
12     A      TRUE    3
13     B        NA   NA
14     B     FALSE   NA
15     B      TRUE    1
16     B      TRUE    1
17     B      TRUE    1
18     B     FALSE   NA
I'm sure there is a relatively easy way to do this, I just can't think of one. I experimented with dense_rank() and other window functions of dplyr, but to no avail.
Recommended answer
Another data.table approach:
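library(data.table)
# dt is assumed to hold the data.frame from the question
setDT(dt)  # convert to a data.table by reference
# cr: run id over the whole criterium column
# goal: renumber the TRUE runs from 1 within each group (NA in i is treated as FALSE)
dt[, cr := rleid(criterium)][
  (criterium), goal := rleid(cr), by = .(group)]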
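For readers without data.table, the same idea can be sketched in base R: mark the first row of every TRUE run, take a cumulative sum of those marks within each group, and blank out the rows that are not TRUE. This is only an illustration, not part of the answer above. The helper name run_counter and the data frame name df are made up for this example, and the data is rebuilt by hand to match the dput output in the question.

```r
# Rebuild the example data from the question (same values as the dput output).
df <- data.frame(
  group = factor(rep(c("A", "B"), times = c(12, 6))),
  criterium = c(NA, TRUE, TRUE, TRUE, FALSE, FALSE, TRUE, TRUE, FALSE,
                TRUE, TRUE, TRUE, NA, FALSE, TRUE, TRUE, TRUE, FALSE)
)

# Hypothetical helper: number each run of TRUE, leaving FALSE/NA rows as NA.
run_counter <- function(x) {
  is_true <- !is.na(x) & x                           # treat NA like FALSE
  starts  <- is_true & !c(FALSE, head(is_true, -1))  # first row of each TRUE run
  out <- cumsum(starts)                              # running count of runs seen so far
  out[!is_true] <- NA                                # only TRUE rows keep an ID
  out
}

# Apply the helper separately to each group, then restore the original row order.
df$goal <- unsplit(lapply(split(df$criterium, df$group), run_counter), df$group)
print(df)
```

Because the counter is computed per group, the numbering restarts at 1 in group B, which matches the goal column requested in the question.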