Exclude groups with NAs in tidy dataset
Problem description
I have a tidy tibble with a value column identified by 4 ID columns.
> MWA
# A tibble: 16 x 5
# Groups: Dir [2]
VP Con Dir Seg time_seg
<int> <int> <int> <int> <int>
1 10 2 1 1 1810
2 10 2 1 2 260
3 10 2 1 3 540
4 10 2 1 4 1470
5 10 2 1 5 460
6 10 2 1 6 690
7 10 2 1 7 760
8 10 2 1 8 NA
9 10 2 2 1 320
10 10 2 2 2 1110
11 10 2 2 3 450
12 10 2 2 4 600
13 10 2 2 5 1680
14 10 2 2 6 730
15 10 2 2 7 850
16 10 2 2 8 840
The dput output to reproduce it is:
> dput(MWA)
structure(list(VP = c(10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L,
10L, 10L, 10L, 10L, 10L, 10L, 10L, 10L), Con = c(2L, 2L, 2L,
2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L), Dir = c(1L,
1L, 1L, 1L, 1L, 1L, 1L, 1L, 2L, 2L, 2L, 2L, 2L, 2L, 2L, 2L),
Seg = c(1L, 2L, 3L, 4L, 5L, 6L, 7L, 8L, 1L, 2L, 3L, 4L, 5L,
6L, 7L, 8L), time_seg = c(1810L, 260L, 540L, 1470L, 460L,
690L, 760L, NA, 320L, 1110L, 450L, 600L, 1680L, 730L, 850L,
840L)), row.names = c(NA, -16L), class = c("grouped_df",
"tbl_df", "tbl", "data.frame"), vars = "Dir", drop = TRUE, indices = list(
0:7, 8:15), group_sizes = c(8L, 8L), biggest_group_size = 8L, labels = structure(list(
Dir = 1:2), row.names = c(NA, -2L), class = "data.frame", vars = "Dir", drop = TRUE))
They stem from a larger data set, where they have been grouped by VP, Con and finally Dir.
As you can see, there is an NA in row 8 of the tibble.
Using dplyr, I now want to exclude the whole Dir group (so rows 1 through 8) because this one value is missing.
Using filter with is.na or complete.cases only removes the row with the NA, not the complete group (which is one "case" in this dataset).
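The row-wise behaviour described above can be reproduced on a small scale. The following is a minimal sketch, assuming the attempt had roughly this form; the tibble `mini` and the result name `row_wise` are hypothetical stand-ins, not part of the original data.

```r
library(dplyr)

# Hypothetical miniature of MWA: two Dir groups, one NA in group 1
mini <- tibble(
  Dir = c(1L, 1L, 1L, 2L, 2L, 2L),
  Seg = c(1L, 2L, 3L, 1L, 2L, 3L),
  time_seg = c(1810L, 260L, NA, 320L, 1110L, 450L)
)

# The condition is evaluated row by row, so only the single
# row holding the NA is dropped; the rest of Dir 1 survives
row_wise <- mini %>%
  group_by(Dir) %>%
  filter(!is.na(time_seg))

nrow(row_wise)        # 5 rows remain
unique(row_wise$Dir)  # both groups are still present
```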
Recommended answer
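Using all() evaluates the condition over the entire group, so you can skip the mutate step. Inside a grouped filter(), all(!is.na(time_seg)) returns a single TRUE or FALSE per group, so all rows of a group are kept or dropped together.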
library(dplyr)

MWA %>%
  group_by(Dir) %>%
  # keep a group only if none of its time_seg values is NA
  filter(all(!is.na(time_seg)))
# A tibble: 8 x 5
# Groups: Dir [1]
VP Con Dir Seg time_seg
<int> <int> <int> <int> <int>
1 10 2 2 1 320
2 10 2 2 2 1110
3 10 2 2 3 450
4 10 2 2 4 600
5 10 2 2 5 1680
6 10 2 2 6 730
7 10 2 2 7 850
8 10 2 2 8 840
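Since the question mentions that the full data set is grouped by VP, Con and Dir, grouping by Dir alone would probably drop too much there: every row with the same Dir value would go, regardless of VP and Con. The sketch below groups by all three ID columns instead and uses base R's anyNA() as an equivalent of all(!is.na(...)). The tibble `big` and the result name `kept` are hypothetical, invented only to stand in for the larger data set.

```r
library(dplyr)

# Hypothetical stand-in for the larger data set: two VP values,
# with one NA in the group VP = 10, Con = 2, Dir = 1
big <- tibble(
  VP  = rep(c(10L, 11L), each = 4),
  Con = 2L,
  Dir = rep(c(1L, 2L, 1L, 2L), each = 2),
  Seg = rep(1:2, times = 4),
  time_seg = c(1810L, NA, 320L, 1110L, 540L, 1470L, 450L, 600L)
)

kept <- big %>%
  group_by(VP, Con, Dir) %>%
  # drop a VP/Con/Dir group as soon as it contains any NA
  filter(!anyNA(time_seg)) %>%
  ungroup()

kept
```

With this grouping only the VP 10 / Dir 1 group is removed, while the Dir 1 rows of VP 11 are kept. The trailing ungroup() is optional and simply returns a plain tibble for further steps.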