R: How to group_by, split or subset by row values
Question
This is a continuation of my previous question, "R How to group_by row values? Split?"
The input data frame has changed to:
library(stringr)  # provides str_c()
id = str_c("x",1:22)
val = c(rep("NO1", 2), "START", rep("yes1", 2), "STOP", "NO",
"START","NO1", "START", rep("yes2", 3), "STOP", "NO1",
"START", rep("NO3",3), "STOP", "NO1", "STOP")
data = data.frame(id,val)
The expected output is a data frame whose val column is as follows:
val = c("START", rep("yes1", 2), "STOP",
"START","NO1", "START", rep("yes2", 3), "STOP",
"START", rep("NO3",3), "STOP", "NO1", "STOP")
Recommended answer
Simply put, if we drop every entry that is neither START nor STOP, a START is a valid start point if it is the first START or is immediately preceded by a STOP; similarly, a STOP is a valid end point if it is the last STOP or is immediately followed by a START. Consider this function:
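valid_anchors <- function(x) {
  x <- as.character(x)  # so factor columns behave like character vectors
  # Keep only the anchors and remember their original positions.
  are_anchors <- x %in% c("START", "STOP")
  id <- seq_along(x)[are_anchors]
  x <- x[are_anchors]
  # Valid START: no anchor before it, or directly preceded by a STOP.
  start_pos <- which(x == "START" & c("", head(x, -1L)) %in% c("", "STOP"))
  # Valid STOP: no anchor after it, or directly followed by a START.
  stop_pos <- which(x == "STOP" & c(tail(x, -1L), "") %in% c("", "START"))
  # Original row positions of the valid starts and the valid stops.
  list(id[start_pos], id[stop_pos])
}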
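To see the rule at work on the sample data, here is a minimal sketch that lists every anchor next to the result of the two checks. The names is_anchor, prev_anchor, next_anchor and anchor_table are illustrative only and are not part of the original answer.

```r
# Sample column from the question.
val <- c(rep("NO1", 2), "START", rep("yes1", 2), "STOP", "NO",
         "START", "NO1", "START", rep("yes2", 3), "STOP", "NO1",
         "START", rep("NO3", 3), "STOP", "NO1", "STOP")

# Keep only the anchors, remembering their original row positions.
is_anchor <- val %in% c("START", "STOP")
pos       <- which(is_anchor)
anchors   <- val[is_anchor]

# Neighbouring anchors; "" means "no anchor before" / "no anchor after".
prev_anchor <- c("", head(anchors, -1L))
next_anchor <- c(tail(anchors, -1L), "")

anchor_table <- data.frame(
  pos         = pos,
  anchor      = anchors,
  valid_start = anchors == "START" & prev_anchor %in% c("", "STOP"),
  valid_stop  = anchors == "STOP"  & next_anchor %in% c("", "START")
)
print(anchor_table)
```

For this input the valid starts are rows 3, 8 and 16, and the valid stops are rows 6, 14 and 22. The START at row 10 is rejected because the anchor before it is another START, and the STOP at row 20 is rejected because the anchor after it is another STOP.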
Then just apply the same approach you got in your last post:
ind <- valid_anchors(data$val)
data[sort(unique(unlist(mapply(`:`, ind[[1]], ind[[2]])))), ]
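The one-liner above can be unpacked into separate steps, which may make it easier to check. This is only a sketch: it hard-codes the start and stop positions that valid_anchors returns for the sample data, uses base paste0 instead of str_c so that it runs without stringr, and swaps mapply for Map; the names starts, stops, ranges, keep and result are illustrative.

```r
# Sample data from the question (paste0 stands in for stringr::str_c).
val <- c(rep("NO1", 2), "START", rep("yes1", 2), "STOP", "NO",
         "START", "NO1", "START", rep("yes2", 3), "STOP", "NO1",
         "START", rep("NO3", 3), "STOP", "NO1", "STOP")
data <- data.frame(id = paste0("x", 1:22), val = val)

# Assumption: these are the positions valid_anchors(data$val) yields here.
starts <- c(3L, 8L, 16L)
stops  <- c(6L, 14L, 22L)

# One integer range per (start, stop) pair.
ranges <- Map(seq, starts, stops)

# Flatten the ranges into a sorted vector of row numbers, then subset.
keep   <- sort(unique(unlist(ranges)))
result <- data[keep, ]

# Compare against the expected val column from the question.
expected <- c("START", rep("yes1", 2), "STOP",
              "START", "NO1", "START", rep("yes2", 3), "STOP",
              "START", rep("NO3", 3), "STOP", "NO1", "STOP")
stopifnot(identical(as.character(result$val), expected))
```

Map always returns a list, so the shape of ranges does not depend on whether the ranges happen to have equal lengths.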
Output
    id   val
3   x3 START
4   x4  yes1
5   x5  yes1
6   x6  STOP
8   x8 START
9   x9   NO1
10 x10 START
11 x11  yes2
12 x12  yes2
13 x13  yes2
14 x14  STOP
16 x16 START
17 x17   NO3
18 x18   NO3
19 x19   NO3
20 x20  STOP
21 x21   NO1
22 x22  STOP