Remove rows with all or some NAs (missing values) in data.frame
Question
I'd like to remove the rows in this data frame that:
a) contain NAs across all columns. Below is my example data frame.
gene hsap mmul mmus rnor cfam
1 ENSG00000208234 0 NA NA NA NA
2 ENSG00000199674 0 2 2 2 2
3 ENSG00000221622 0 NA NA NA NA
4 ENSG00000207604 0 NA NA 1 2
5 ENSG00000207431 0 NA NA NA NA
6 ENSG00000221312 0 1 2 3 2
Basically, I'd like to get a data frame such as the following.
gene hsap mmul mmus rnor cfam
2 ENSG00000199674 0 2 2 2 2
6 ENSG00000221312 0 1 2 3 2
b) contain NAs in only some columns, so I can also get this result:
gene hsap mmul mmus rnor cfam
2 ENSG00000199674 0 2 2 2 2
4 ENSG00000207604 0 NA NA 1 2
6 ENSG00000221312 0 1 2 3 2
Recommended answer
na.omit is the simpler choice when you just want to drop every row that contains any NA. complete.cases allows partial selection, because you can pass it only the columns of the data frame that you care about:
> final[complete.cases(final[ , 5:6]),]
gene hsap mmul mmus rnor cfam
2 ENSG00000199674 0 2 2 2 2
4 ENSG00000207604 0 NA NA 1 2
6 ENSG00000221312 0 1 2 3 2
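To try the commands above, here is a minimal sketch that rebuilds the example data from the question. The name final comes from the answer, but the construction code is an assumption, since the post only shows the printed data frame. It compares na.omit with complete.cases for cases (a) and (b).

```r
# Rebuild the example data frame (construction is assumed; the post
# only shows printed output, not how `final` was created).
final <- data.frame(
  gene = c("ENSG00000208234", "ENSG00000199674", "ENSG00000221622",
           "ENSG00000207604", "ENSG00000207431", "ENSG00000221312"),
  hsap = c(0, 0, 0, 0, 0, 0),
  mmul = c(NA, 2, NA, NA, NA, 1),
  mmus = c(NA, 2, NA, NA, NA, 2),
  rnor = c(NA, 2, NA, 1, NA, 3),
  cfam = c(NA, 2, NA, 2, NA, 2),
  stringsAsFactors = FALSE
)

# (a) Drop every row that has an NA in any column.
all_complete <- na.omit(final)
same_rows    <- final[complete.cases(final), ]

# (b) Drop a row only if column 5 or 6 (rnor, cfam) contains an NA.
partial <- final[complete.cases(final[, 5:6]), ]

print(rownames(all_complete))  # rows 2 and 6
print(rownames(partial))       # rows 2, 4 and 6
```

Note that na.omit also attaches an na.action attribute that records which rows were dropped, while subsetting with complete.cases returns a plain data frame.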
Your solution can't work. If you insist on using is.na, you have to do something like:
> final[rowSums(is.na(final[ , 5:6])) == 0, ]
gene hsap mmul mmus rnor cfam
2 ENSG00000199674 0 2 2 2 2
4 ENSG00000207604 0 NA NA 1 2
6 ENSG00000221312 0 1 2 3 2
but using complete.cases is a lot clearer, and faster.
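As a rough check, the sketch below confirms that the two approaches select the same rows on the example data. It also shows one possible reading of "NA in all columns": dropping a row only when every one of the four species columns is NA. The data frame construction and the species_cols helper are assumptions for illustration and are not part of the original post.

```r
# Example data rebuilt by hand (assumed construction).
final <- data.frame(
  gene = c("ENSG00000208234", "ENSG00000199674", "ENSG00000221622",
           "ENSG00000207604", "ENSG00000207431", "ENSG00000221312"),
  hsap = rep(0, 6),
  mmul = c(NA, 2, NA, NA, NA, 1),
  mmus = c(NA, 2, NA, NA, NA, 2),
  rnor = c(NA, 2, NA, 1, NA, 3),
  cfam = c(NA, 2, NA, 2, NA, 2),
  stringsAsFactors = FALSE
)

# Both expressions flag rows with no NA in columns 5 and 6.
by_cases  <- complete.cases(final[, 5:6])
by_rowsum <- rowSums(is.na(final[, 5:6])) == 0
# rowSums() keeps row names, so strip them before comparing.
agree <- identical(by_cases, unname(by_rowsum))

# Hypothetical variant: drop a row only if ALL species columns are NA.
species_cols <- c("mmul", "mmus", "rnor", "cfam")
n_missing    <- rowSums(is.na(final[, species_cols]))
not_all_na   <- final[n_missing < length(species_cols), ]

print(agree)
print(rownames(not_all_na))
```

The rowSums form is worth keeping in mind when the condition is a count of NAs (for example "at most one NA"), which complete.cases cannot express directly.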