How to subset a data.frame if the column contains NAs

Views: 76 Published: 2020/10/17 1:32:20 Tags: r, dataframe

Question

R (version 3.3.3) is giving me some unexpected behavior when I subset a data frame on a condition based on a character column. Here is an example:

foo <- data.frame(bar = c('a',NA,'b','a'),
                  baz = 1:4,
                  stringsAsFactors = FALSE)

foo looks like this:

   bar baz
1    a   1
2 <NA>   2
3    b   3
4    a   4

I want to get all rows of this data frame where bar != "a", so I call:

foo[foo$bar != 'a', ]

This returns:

    bar baz
NA <NA>  NA
3     b   3

I do not understand why the first entry in the second column (baz) is NA and not 2. Please help me explain this strange behavior.

Recommended answer

While I'm still trying to understand the behaviour, a safer way to filter on a character column that may contain NA is the %in% operator, which returns FALSE rather than NA for a missing value:

foo <- data.frame(bar = c('a',NA,'b','a'),
                  baz = 1:4,
                  stringsAsFactors = FALSE)

foo[!(foo$bar %in% 'a'), ]

Output:

> foo[!(foo$bar %in% 'a'), ]
   bar baz
2 <NA>   2
3    b   3
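Whether the NA row should be kept depends on what you want from the filter. The following is a minimal sketch, not part of the original answer, of three base R ways to make that choice explicit. The names keep_na, drop_na and drop_na2 are introduced here for illustration only.

```r
foo <- data.frame(bar = c('a', NA, 'b', 'a'),
                  baz = 1:4,
                  stringsAsFactors = FALSE)

# Keep rows where bar is missing: test for NA explicitly before comparing
keep_na <- foo[is.na(foo$bar) | foo$bar != 'a', ]

# Drop rows where bar is missing: which() returns only the TRUE positions,
# so NA entries in the condition are discarded
drop_na <- foo[which(foo$bar != 'a'), ]

# subset() also treats NA in the condition as FALSE
drop_na2 <- subset(foo, bar != 'a')

print(keep_na)
print(drop_na)
print(drop_na2)
```

Here keep_na selects the same rows as the %in% version above (rows 2 and 3), while drop_na and drop_na2 keep only row 3.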

Update:

The behaviour isn't caused by the character filter. It happens because NA is used to index the data frame: foo$bar != 'a' evaluates to c(FALSE, NA, TRUE, FALSE), and an NA in a logical row index produces a row filled with NA.

> foo[c(FALSE, NA, TRUE, FALSE), ]
    bar baz
NA <NA>  NA
3     b   3
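To see where that index comes from, it can help to print the condition on its own. This is a small sketch using the same foo as in the question; cond and in_a are illustrative names, not part of the original post.

```r
foo <- data.frame(bar = c('a', NA, 'b', 'a'),
                  baz = 1:4,
                  stringsAsFactors = FALSE)

# Comparing NA with a value gives NA, so the condition is not purely TRUE/FALSE
cond <- foo$bar != 'a'
print(cond)   # FALSE NA TRUE FALSE

# %in% is based on match() and returns FALSE for the NA element instead of NA
in_a <- foo$bar %in% 'a'
print(in_a)   # TRUE FALSE FALSE TRUE
```

The NA in cond is what ends up as the all-NA row in the subset; negating in_a gives a clean logical index with no NA in it.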

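Passing NA as an index value returns rows filled with NA. A logical index shorter than the number of rows is recycled, so a single NA yields four NA rows: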

> foo[NA,]
      bar baz
NA   <NA>  NA
NA.1 <NA>  NA
NA.2 <NA>  NA
NA.3 <NA>  NA
> foo[c(TRUE, NA), ]
      bar baz
1       a   1
NA   <NA>  NA
3       b   3
NA.1 <NA>  NA
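The same recycling rule applies to plain vectors, where it is easier to inspect. The sketch below uses a made-up vector x to contrast a logical NA (recycled to the full length) with an integer NA (a single position).

```r
x <- c(10, 20, 30, 40)

# A bare NA is logical, so it is recycled to the length of x
all_na <- x[NA]
print(all_na)    # NA NA NA NA

# NA_integer_ is a single (unknown) position, so only one NA comes back
one_na <- x[NA_integer_]
print(one_na)    # NA

# c(TRUE, NA) is recycled to c(TRUE, NA, TRUE, NA)
mixed <- x[c(TRUE, NA)]
print(mixed)     # 10 NA 30 NA
```

This mirrors the data frame output above: foo[NA, ] returns one NA row per original row, and foo[c(TRUE, NA), ] alternates real rows with NA rows.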
