How to delete rows from a dataframe that contain n*NA
Problem description
I have a number of large datasets with ~10 columns and ~200000 rows. Not every column contains a value for each row, although at least one column must contain a value for the row to be present. I would like to set a threshold for how many NAs are allowed in a row.
My Dataframe looks something like this:
ID q r s t u v w x y z
A 1 5 NA 3 8 9 NA 8 6 4
B 5 NA 4 6 1 9 7 4 9 3
C NA 9 4 NA 4 8 4 NA 5 NA
D 2 2 6 8 4 NA 3 7 1 32
I would like to delete the rows that contain more than 2 NA values, to get:
ID q r s t u v w x y z
A 1 5 NA 3 8 9 NA 8 6 4
B 5 NA 4 6 1 9 7 4 9 3
D 2 2 6 8 4 NA 3 7 1 32
complete.cases removes all rows containing any NA, and I know one can delete rows that contain NA in certain columns. Is there a way to filter on how many NAs a row contains in total, regardless of which columns they are in?
Alternatively, this dataframe is generated by merging several dataframes using
file1 <- read.delim("~/file1.txt")
file2 <- read.delim(file = args[1])
file1 <- merge(file1, file2, by = "chr.pos", all = TRUE)
Perhaps the merge function could be altered?
Thanks
Use rowSums. To remove rows from a data frame (df) that contain precisely n NA values:
df <- df[rowSums(is.na(df)) != n, ]
or to remove rows that contain n or more NA values:
df <- df[rowSums(is.na(df)) < n, ]
In both cases, replace n with the number required. For the threshold in the question (drop rows with more than 2 NAs), use the second form with n = 3.
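As a worked sketch, here is the second form applied to the sample data from the question. The data frame is retyped by hand from the table in the post, and the names n, na_count and filtered are my own for illustration. Dropping rows with more than 2 NAs is the same as keeping rows with fewer than 3, so n is 3 here.

```r
# Sample data retyped from the table in the question
df <- data.frame(
  ID = c("A", "B", "C", "D"),
  q = c(1, 5, NA, 2),
  r = c(5, NA, 9, 2),
  s = c(NA, 4, 4, 6),
  t = c(3, 6, NA, 8),
  u = c(8, 1, 4, 4),
  v = c(9, 9, 8, NA),
  w = c(NA, 7, 4, 3),
  x = c(8, 4, NA, 7),
  y = c(6, 9, 5, 1),
  z = c(4, 3, NA, 32)
)

# is.na(df) is a logical matrix; rowSums counts the TRUEs in each row
na_count <- rowSums(is.na(df))

# Keep rows with fewer than n NAs (n = 3 drops rows with more than 2 NAs)
n <- 3
filtered <- df[na_count < n, ]

print(filtered)
```

Row C has four NAs and is dropped; rows A, B and D are kept. The same filter can be applied directly to the data frame returned by merge(..., all = TRUE), so the merge call itself does not need to change.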