How to filter data without losing NA rows using dplyr
Question
The post above subsets using logical indexing. Is there a way to do the same thing in dplyr?
Also, when does dplyr automatically drop NA values? In my experience, it removes NA rows when I filter out a specific string, e.g.:
b = a %>% filter(col != "str")
I would expect this not to exclude NA values, but it does. When I write the filter in a different form, however, NA values are not automatically excluded, e.g.:
b = a %>% filter(!grepl("str", col))
I would like to understand this behavior of filter. I would appreciate any help. Thank you!
Recommended answer
The documentation for dplyr::filter says ... "Unlike base subsetting, rows where the condition evaluates to NA are dropped."
NA != "str" evaluates to NA, so that row is dropped by filter.
!grepl("str", NA) returns TRUE, because grepl returns FALSE rather than NA for an NA input, so that row is kept.
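To see the difference between the two conditions without dplyr, a minimal base R sketch can evaluate each one on a small vector. The vector `col` and the variable names `neq_result` and `grepl_result` are made up for illustration and are not from the question.

```r
# Hypothetical sample values; `col` mirrors the column name used in the question.
col <- c("str", "other", NA)

neq_result   <- col != "str"        # comparing NA with a string yields NA
grepl_result <- !grepl("str", col)  # grepl() gives FALSE for NA, so the negation is TRUE

print(neq_result)    # FALSE TRUE NA
print(grepl_result)  # FALSE TRUE TRUE
```

filter keeps only the rows where the condition is TRUE, so the NA in the first result is what causes the row to disappear.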
If you want filter to keep the NA rows, you could use filter(is.na(col) | col != "str").
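The following sketch applies that filter to a small data frame, assuming dplyr is installed. The data frame `a` and its `id` column are hypothetical stand-ins for the data in the question. The `%in%` variant is not part of the original answer; it is included as an alternative that relies on `%in%` returning FALSE, not NA, for an NA value.

```r
library(dplyr)

# Hypothetical data frame standing in for `a` from the question.
a <- data.frame(id = 1:3, col = c("str", "other", NA))

dropped_na <- a %>% filter(col != "str")               # condition is NA for row 3, so it is dropped
kept_na    <- a %>% filter(is.na(col) | col != "str")  # TRUE | NA is TRUE, so row 3 is kept
kept_in    <- a %>% filter(!(col %in% "str"))          # %in% never returns NA, so row 3 is kept

dropped_na$id  # 2
kept_na$id     # 2 3
kept_in$id     # 2 3
```

The explicit is.na(col) check states the intent most directly; the `%in%` form is shorter but hides the NA handling.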