dplyr::select() with some variables that may not exist in the data frame?
Question
I have a helper function (say foo()) that will be run on various data frames that may or may not contain specified variables. Suppose I have
library(dplyr)
# tibble() replaces the deprecated data_frame()
d1 <- tibble(taxon = 1, model = 2, z = 3)
d2 <- tibble(taxon = 2, pss = 4, z = 3)
The variables I want to select are
vars <- intersect(names(data),c("taxon","model","z"))
That is, I'd like foo(d1) to return the taxon, model, and z columns, while foo(d2) returns just taxon and z.
If foo contains select(data,c(taxon,model,z)) then foo(d2) fails (because d2 doesn't contain model). If I use select(data,-pss) then foo(d1) fails similarly.
I know how to do this if I retreat from the tidyverse (just return data[vars]), but I'm wondering if there's a handy way to do this either (1) with a select() helper of some sort (tidyselect::select_helpers) or (2) with tidyeval (which I still haven't found time to get my head around!)
Recommended answer
Another option is select_if:
d2 %>% select_if(names(.) %in% c('taxon', 'model', 'z'))
# # A tibble: 1 x 2
#   taxon     z
#   <dbl> <dbl>
# 1     2     3
select_if is superseded. Use any_of instead:
d2 %>% select(any_of(c('taxon', 'model', 'z')))
# # A tibble: 1 x 2
#   taxon     z
#   <dbl> <dbl>
# 1     2     3
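To tie this back to the helper in the question, here is a minimal sketch of what foo() might look like with any_of. The function body and the `wanted` argument are my own guess at the shape of the helper, not something from the question, and it assumes dplyr 1.0.0 or later (where any_of is available).

```r
library(dplyr)

d1 <- tibble(taxon = 1, model = 2, z = 3)
d2 <- tibble(taxon = 2, pss = 4, z = 3)

# Hypothetical helper: keep whichever of the wanted columns are present.
# `wanted` is an illustrative argument name, not part of the original question.
foo <- function(data, wanted = c("taxon", "model", "z")) {
  select(data, any_of(wanted))
}

names(foo(d1))  # "taxon" "model" "z"
names(foo(d2))  # "taxon" "z"
```

Because the column names are passed as a character vector, this needs no tidyeval inside the function.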
Type ?dplyr::select in R and you will find this:
These helpers select variables from a character vector:
all_of(): Matches variable names in a character vector. All names must be present, otherwise an out-of-bounds error is thrown.
any_of(): Same as all_of(), except that no error is thrown for names that don't exist.
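The difference between the two helpers can be seen on d2 from the question. This is only a sketch: the names `wanted`, `strict`, `lenient`, and `dropped` are mine, and the tryCatch wrapper is there just so the script keeps running after the all_of error. The last call covers the select(data,-pss) case from the question by negating any_of.

```r
library(dplyr)

d1 <- tibble(taxon = 1, model = 2, z = 3)
d2 <- tibble(taxon = 2, pss = 4, z = 3)
wanted <- c("taxon", "model", "z")

# all_of() is strict: "model" is missing from d2, so select() raises an error.
# tryCatch() turns that error into a marker string so the script continues.
strict <- tryCatch(
  select(d2, all_of(wanted)),
  error = function(e) "error"
)

# any_of() is lenient: names that are not in the data are skipped.
lenient <- select(d2, any_of(wanted))

# Negative selection: drop pss if it exists, otherwise keep every column.
dropped <- select(d1, -any_of("pss"))
```

Here strict ends up holding the marker string rather than a tibble, lenient keeps taxon and z, and dropped keeps all three columns of d1.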