Use group_by to filter specific cases while keeping NAs
Question
I want to filter my dataset to keep the cases that have at least one observation in a specific column. To illustrate:
help <- data.frame(deid = c(5, 5, 5, 5, 5, 12, 12, 12, 12, 17, 17, 17),
score.a = c(NA, 1, 1, 1, NA, NA, NA, NA, NA, NA, 1, NA))
This creates:
deid score.a
1 5 NA
2 5 1
3 5 1
4 5 1
5 5 NA
6 12 NA
7 12 NA
8 12 NA
9 12 NA
10 17 NA
11 17 1
12 17 NA
I want to tell dplyr to keep the cases (deid groups) that have any observations in score.a, including the NA rows of those cases. Thus, I want it to return:
deid score.a
1 5 NA
2 5 1
3 5 1
4 5 1
5 5 NA
6 17 NA
7 17 1
8 17 NA
I ran the code help %>% group_by(deid) %>% filter(score.a > 0), but it also drops the NA rows. Thank you for any assistance.
Edit: A similar question was asked here: How to remove groups of observation with dplyr::filter(). However, the answer there uses the 'all' condition, whereas this case requires the 'any' condition.
Recommended answer
Try
library(dplyr)
help %>%
group_by(deid) %>%
filter(any(score.a >0 & !is.na(score.a)))
# deid score.a
#1 5 NA
#2 5 1
#3 5 1
#4 5 1
#5 5 NA
#6 17 NA
#7 17 1
#8 17 NA
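A minimal sketch of why the row-wise condition loses the NA rows while the group-wise any() keeps them. It rebuilds the question's data; the names rowwise_result and group_flags are illustrative only and are not part of the original answer.

```r
library(dplyr)

help <- data.frame(deid = c(5, 5, 5, 5, 5, 12, 12, 12, 12, 17, 17, 17),
                   score.a = c(NA, 1, 1, 1, NA, NA, NA, NA, NA, NA, 1, NA))

# NA > 0 evaluates to NA, and filter() keeps only rows where the
# condition is TRUE, so a row-wise test drops every NA row.
rowwise_result <- help %>%
  group_by(deid) %>%
  filter(score.a > 0)

# any() collapses each group to a single TRUE/FALSE, which filter()
# then applies to all rows of that group, NA rows included.
group_flags <- help %>%
  group_by(deid) %>%
  summarise(keep = any(score.a > 0 & !is.na(score.a)))
```

Here group_flags$keep is TRUE for deid 5 and 17 and FALSE for deid 12. Writing the condition as any(score.a > 0, na.rm = TRUE) should give the same flags, since removing the NAs before any() has the same effect as the explicit !is.na() test.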
Or a similar approach with data.table:
library(data.table)
setDT(help)[, if(any(score.a>0 & !is.na(score.a))) .SD , deid]
# deid score.a
#1: 5 NA
#2: 5 1
#3: 5 1
#4: 5 1
#5: 5 NA
#6: 17 NA
#7: 17 1
#8: 17 NA
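For readers without either package, the same group-wise test can be sketched in base R with ave(). This is an alternative not given in the original answer, and the names keep and result are illustrative.

```r
help <- data.frame(deid = c(5, 5, 5, 5, 5, 12, 12, 12, 12, 17, 17, 17),
                   score.a = c(NA, 1, 1, 1, NA, NA, NA, NA, NA, NA, 1, NA))

# ave() evaluates the function once per deid group and recycles the single
# result over that group's rows; the logical is stored as 1/0 because
# score.a is numeric, hence the comparison with 1.
keep <- ave(help$score.a, help$deid,
            FUN = function(x) any(x > 0, na.rm = TRUE)) == 1

# Keep every row of the flagged groups, NA rows included.
result <- help[keep, ]
```

The result keeps the original row names (1 to 5 and 10 to 12) rather than renumbering them as the dplyr and data.table outputs do.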
If the condition is instead to keep only the 'deid' groups in which all of the non-NA values of 'score.a' are > 0, the above code can be modified to:
setDT(help)[, if(!all(is.na(score.a)) &
all(score.a[!is.na(score.a)]>0)) .SD , deid]
# deid score.a
#1: 5 NA
#2: 5 1
#3: 5 1
#4: 5 1
#5: 5 NA
#6: 17 NA
#7: 17 1
#8: 17 NA
Suppose one of the 'score.a' values in a 'deid' group is less than 0:
help$score.a[3] <- -1
The above code then returns:
setDT(help)[, if(!all(is.na(score.a)) &
all(score.a[!is.na(score.a)]>0)) .SD , deid]
# deid score.a
#1: 17 NA
#2: 17 1
#3: 17 NA
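The same 'all' condition can be sketched with dplyr as well. This version is an assumption about how the data.table code translates and does not come from the original answer; the name result is illustrative.

```r
library(dplyr)

help <- data.frame(deid = c(5, 5, 5, 5, 5, 12, 12, 12, 12, 17, 17, 17),
                   score.a = c(NA, 1, 1, 1, NA, NA, NA, NA, NA, NA, 1, NA))
help$score.a[3] <- -1  # make one value in deid 5 negative

result <- help %>%
  group_by(deid) %>%
  # keep a group only if it has at least one non-NA value
  # and every non-NA value is above 0
  filter(!all(is.na(score.a)) & all(score.a > 0, na.rm = TRUE))
```

The !all(is.na(score.a)) guard matters because all() over an empty set is TRUE, so without it the all-NA group (deid 12) would be kept.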