Filter based on NA in dplyr
Question
This is my df:
df <- structure(structure(list(group = structure(c(1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 3L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L, 5L), .Label = c("A", "B", "C", "D", "E"), class = "factor"), y = c(NA, NA, NA, NA, 1, NA, NA, NA, 1, 2, NA, NA, 1, 2, 3, NA, 2, 2, 3, 4, NA, 3, 3, 4, 5), x = c(1L, 2L, 3L, 4L,5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L, 1L, 2L, 3L, 4L, 5L)), .Names = c("group", "y", "x"), row.names = c(NA, 25L), class = "data.frame"))
> df
group y x
1 A NA 1
2 A NA 2
3 A NA 3
4 A NA 4
5 A 1 5
6 B NA 1
7 B NA 2
8 B NA 3
9 B 1 4
10 B 2 5
11 C NA 1
12 C NA 2
13 C 1 3
14 C 2 4
15 C 3 5
16 D NA 1
17 D 2 2
18 D 2 3
19 D 3 4
20 D 4 5
21 E NA 1
22 E 3 2
23 E 3 3
24 E 4 4
25 E 5 5
My goal is to calculate the mean per x value (across groups), using mutate. But first I'd like to filter the data so that only those values of x remain for which there are at least 3 non-NA values of y. So in this example I only want to include the entries for which x is at least 3. I can't figure out how to write the filter(); any suggestions?
Recommended answer
You can try
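library(dplyr)

df %>%
  group_by(x) %>%                        # group by x, as per the OP's clarification
  filter(sum(!is.na(y)) >= 3) %>%        # keep x values with at least 3 non-NA y values
  mutate(Mean = mean(y, na.rm = TRUE))   # mean of y within each x, ignoring NAs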
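If one row per x is enough, rather than the mean repeated on every row, a variant with summarise() might look like the minimal sketch below. It rebuilds the question's data compactly so that it runs on its own; the names per_x and n_non_na are illustrative and do not come from the original answer.

```r
library(dplyr)

# Same values as the df in the question, rebuilt compactly
df <- data.frame(
  group = factor(rep(c("A", "B", "C", "D", "E"), each = 5)),
  y = c(NA, NA, NA, NA, 1,
        NA, NA, NA, 1, 2,
        NA, NA, 1, 2, 3,
        NA, 2, 2, 3, 4,
        NA, 3, 3, 4, 5),
  x = rep(1:5, times = 5)
)

# One row per x: count the non-NA y values, take their mean,
# then keep only the x values with at least 3 non-NA observations
per_x <- df %>%
  group_by(x) %>%
  summarise(n_non_na = sum(!is.na(y)),
            Mean = mean(y, na.rm = TRUE)) %>%
  filter(n_non_na >= 3)

per_x
```

With the example data this keeps x = 3, 4 and 5, with 3, 4 and 5 non-NA values and means of 2, 2.5 and 3 respectively.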