Filter data.frame based on rowwise NA count
Question
I would like to filter a data.frame based on the number of NAs in each row.
Suppose I start with the following:
> d
  A  B  C  E
1 2  2  6  7
2 4  9 NA 10
3 6 NA NA  4
4 9  7  1  8
I would like to filter d to remove rows with 2 or more NAs in columns A, B, and C, to yield:
  A B  C  E
1 2 2  6  7
2 4 9 NA 10
4 9 7  1  8
Recommended answer
For reproducibility, define a data.frame as below, with a different number of NAs in each row.
df <- data.frame(
  A = c(1, 2, 3, NA),
  B = c(1, 2, NA, NA),
  C = c(1, NA, NA, NA),
  E = c(5, 6, 7, 8)
)
Define a function that counts the number of NAs in each row of a data.frame:
countNA <- function(df) apply(df, MARGIN = 1, FUN = function(x) length(x[is.na(x)]))
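As a possible alternative to the apply() version above, the same per-row count can be sketched with base R's rowSums() applied to is.na(), which works on the whole logical matrix at once instead of looping over rows. The name countNA2 is hypothetical and not part of the original answer; the sample data is repeated here only so the snippet runs on its own.

```r
# Sample data, repeated so this snippet is self-contained
df <- data.frame(
  A = c(1, 2, 3, NA),
  B = c(1, 2, NA, NA),
  C = c(1, NA, NA, NA),
  E = c(5, 6, 7, 8)
)

# Hypothetical helper (not from the original answer):
# is.na() yields a logical matrix; rowSums() counts the TRUEs per row
countNA2 <- function(df) rowSums(is.na(df))

# Count NAs in columns A, B, and C only
counts <- countNA2(df[, c("A", "B", "C")])
print(unname(counts))
```

Both versions should give the same counts; the rowSums() form is usually shorter to read and avoids calling an R function once per row.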
Based on the wording of the question, exclude column E from this calculation:
df_noE <- subset(df, select=-E)
Now count the NAs in each row using the function above:
na_count <- countNA(df_noE)
Now filter the original data.frame with this count, keeping only rows with fewer than 2 NAs:
df[na_count < 2,]
All together in one line:
df[countNA(subset(df, select=-E)) < 2,]
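To connect the answer back to the question, here is a sketch that applies the same steps to the data frame d from the question. The construction of d below is an assumption reconstructed from the printed output in the question, with all columns taken to be numeric.

```r
# Data frame from the question (reconstructed from its printed form;
# numeric column types are an assumption)
d <- data.frame(
  A = c(2, 4, 6, 9),
  B = c(2, 9, NA, 7),
  C = c(6, NA, NA, 1),
  E = c(7, 10, 4, 8)
)

# Row-wise NA counter, as defined in the answer
countNA <- function(df) apply(df, MARGIN = 1, FUN = function(x) length(x[is.na(x)]))

# Count NAs in every column except E, then keep rows with fewer than 2
result <- d[countNA(subset(d, select = -E)) < 2, ]
print(result)
```

Row 3 has NAs in both B and C, so it is the only row dropped; rows 1, 2, and 4 remain, matching the output requested in the question.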