Remove rows which have all NAs in certain columns

Views: 114 | Published: 2020/10/16 21:34:43 | Tags: r, dataframe, na

Problem description

Suppose you have a data frame with 9 columns. You want to remove the rows (cases) in which every one of columns 5:9 is NA. Whether there are NAs in columns 1:4 is irrelevant.

So far I have only found functions that remove rows with an NA in any of columns 5:9, but I specifically need to remove only those rows where all of columns 5:9 are NA.

I wrote my own function to do this, but since I have 300k+ rows, it is very slow. Is there a more efficient way? This is my code:

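remove.select.na <- function(x, cols) {
  # Indices of the rows to keep: rows with at least one non-NA value in `cols`
  nrm <- vector("numeric")
  for (i in seq_len(nrow(x))) {
    # Keep row i if fewer than length(cols) of its checked entries are NA
    if (sum(is.na(x[i, cols])) < length(cols)) {
      nrm <- c(nrm, i)
    }
    # Console output to track the progress
    cat('\r', paste0('Checking row ', i, ' of ', nrow(x), ' (', format(round(i / nrow(x) * 100, 2), nsmall = 2), '%).'))
    flush.console()
  }
  x <- x[nrm, ]
  rm(nrm)
  return(x)
}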

where x is the data frame and cols is a vector containing the names of the columns that should be checked for NAs.

Recommended answer

This is a one-liner to remove the rows that have NA in all of columns 5 to 9. Combining rowSums() with is.na() makes it easy to check whether all entries in these 5 columns are NA: is.na() marks each NA entry, rowSums() counts them per row, and a row is kept only if that count is not 5.

x <- x[rowSums(is.na(x[,5:9]))!=5,]
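
The one-liner hard-codes both the column range and the count 5. As a minimal sketch, the same rowSums()/is.na() idea can be wrapped in a helper that takes the columns to check, like the cols argument in the question. The function name drop_all_na_rows and the small example data frame are assumptions made up for illustration; they are not part of the original answer.

```r
# Hypothetical helper (name and signature are illustrative assumptions).
# Removes rows of `x` in which every column listed in `cols` is NA.
drop_all_na_rows <- function(x, cols) {
  # drop = FALSE keeps a data frame even when `cols` selects a single column
  checked <- x[, cols, drop = FALSE]
  # is.na() on a data frame gives a logical matrix; rowSums() counts NAs per row
  na_count <- rowSums(is.na(checked))
  # Keep rows where at least one checked column is not NA
  x[na_count != ncol(checked), , drop = FALSE]
}

# Made-up data: row 2 is NA in both checked columns, so only that row is dropped
df <- data.frame(
  id = 1:4,
  a  = c(1, NA, NA, 4),
  b  = c(NA, NA, 2, 5)
)
result <- drop_all_na_rows(df, c("a", "b"))
print(result)
```

Comparing against ncol(checked) rather than a literal number keeps the check correct for any number of columns, and cols can hold either column names or positions.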
