How to group values based on NA vs. alphabet
Question
I have a column of LETTER values in alphabetic order in a dataframe, partly interspersed with NA:
df1 <- data.frame(
  phase = c(NA, "A", "B", "D", NA, "A", "B", "C", "E", "A", "B", "D")
)
The LETTER values form groups: everything from an A up to either the next NA or the next A is one group. I'd like to create a new column that makes these groups explicit.
The expected result is:
df1 <- data.frame(
  phase = c(NA, "A", "B", "D", NA, "A", "B", "C", "E", "A", "B", "D"),
  group = c(NA, "group1", "group1", "group1", NA, "group2", "group2",
            "group2", "group2", "group3", "group3", "group3")
)
How can I create this column? I'm grateful for any advice, dplyr-based or otherwise.
What I've tried so far, with only partial success (the third group, which is not separated from the second by an NA, is missed):
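library(dplyr)
df1 %>%
  # cumsum(is.na(phase)) only advances at NA rows, so an "A" that directly
  # follows another group does not start a new one
  mutate(group = cumsum(is.na(phase)),
         group = ifelse(is.na(phase), NA, paste("group", group, sep = "")))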
phase group
1 <NA> <NA>
2 A group1
3 B group1
4 D group1
5 <NA> <NA>
6 A group2
7 B group2
8 C group2
9 E group2
10 A group2
11 B group2
12 D group2
Recommended answer
If phase is "A", jump to the next group. Then replace those groups with NA when phase is NA.
library(dplyr)
df1 %>%
  mutate(group = cumsum(phase == "A" & !is.na(phase)) %>%
           paste0("group", .) %>%
           replace(is.na(phase), NA))
# phase group
# 1 <NA> <NA>
# 2 A group1
# 3 B group1
# 4 D group1
# 5 <NA> <NA>
# 6 A group2
# 7 B group2
# 8 C group2
# 9 E group2
# 10 A group3
# 11 B group3
# 12 D group3
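Since the question also asks for non-dplyr options, here is a minimal base R sketch of the same idea with the intermediate steps spelled out. The helper names starts and id are illustrative only and do not come from the answer; using %in% in place of phase == "A" & !is.na(phase) is a substitution that relies on %in% returning FALSE rather than NA for missing values.

```r
# Same data as in the question.
df1 <- data.frame(
  phase = c(NA, "A", "B", "D", NA, "A", "B", "C", "E", "A", "B", "D")
)

# TRUE only on rows that start a new group.
# %in% yields FALSE (not NA) where phase is NA.
starts <- df1$phase %in% "A"

# Running count of group starts: the counter advances at every "A".
id <- cumsum(starts)

# Label every row, then blank out the rows where phase itself is NA.
df1$group <- ifelse(is.na(df1$phase), NA_character_, paste0("group", id))

print(df1)
```

Note that this sketch, like the pipeline above, only opens a new group at an "A". In the example data every NA is immediately followed by an "A", so the two rules "until the next NA" and "until the next A" agree; if letters could follow an NA without an intervening "A", they would be labelled with the previous group's number.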