Dplyr difference between select and group_by with respect to quoted variables?
Question
In the current version of dplyr, select arguments can be passed by value:
variable <- "Species"
iris %>%
select(variable)
# Species
#1 setosa
#2 setosa
#3 setosa
#4 setosa
#5 setosa
#6 setosa
#...
But group_by arguments cannot be passed by value:
iris %>%
group_by(variable) %>%
summarise(Petal.Length = mean(Petal.Length))
# Error in grouped_df_impl(data, unname(vars), drop) :
# Column `variable` is unknown
iris %>% select(Species)
iris %>%
group_by(Species) %>%
summarise(Petal.Length = mean(Petal.Length))
- Why are select and group_by different with respect to passing arguments by value?
- Why is the first select call working and will it continue to work in the future?
- Why is the first group_by call not working? I'm trying to figure out what combination of quo(), enquo() and !! I should use to make it work.
I need this because I would like to create a function that takes a grouping variable as an input parameter. If possible, the grouping variable should be given as a character string, because two other function parameters are already given as character strings.
Recommended answer
To pass a string as a symbol or as unevaluated code, you first have to parse it into a symbol or an expression. You can use sym or parse_expr from rlang to do the parsing, and later use !! to unquote the result:
library(dplyr)
variable <- rlang::sym("Species")
# variable <- rlang::parse_expr("Species")
iris %>%
group_by(!! variable) %>%
summarise(Petal.Length = mean(Petal.Length))
!! is the unquote operator (a shortcut for UQ(), which later rlang releases deprecated in favor of !!). It unquotes the expression or symbol, so that variable is evaluated only within the scope where it is called, namely group_by.
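Since the question asks for a function that receives the grouping variable as a character string, the same sym plus !! pattern can be wrapped as in the sketch below. The function name mean_by and its parameters group_var and value_var are hypothetical, chosen only for illustration; this is one possible arrangement, not the only one.

```r
library(dplyr)

# Hypothetical helper: mean_by, group_var and value_var are illustrative names.
mean_by <- function(data, group_var, value_var) {
  group_sym <- rlang::sym(group_var)  # string -> symbol
  value_sym <- rlang::sym(value_var)  # string -> symbol

  data %>%
    group_by(!!group_sym) %>%                   # unquote the grouping symbol
    summarise(mean_value = mean(!!value_sym))   # unquote the value symbol
}

res <- mean_by(iris, "Species", "Petal.Length")
res
```

Both column names arrive as plain strings, which matches the requirement that the other function parameters are already character strings.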
Difference between sym and parse_expr, and which one to use when?
The short answer: it doesn't matter in this case.
The long answer:
A symbol is a way to refer to an R object, basically the "name" of an object. So sym is similar to as.name in base R. parse_expr, on the other hand, transforms some text into an R expression. This is similar to parse(text = ...) in base R.
Expressions can be any R code, not just code that references R objects. So you can parse code that references an R object, but you can't usefully turn arbitrary code into a symbol: sym treats the whole string as a single name, so evaluating the result fails unless an object with exactly that name exists.
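The distinction can be checked directly in a small sketch. The variable names s, e, odd, doubled and looked_up are illustrative only; the code uses rlang's is_symbol, is_call and eval_tidy to inspect and evaluate the results.

```r
library(rlang)

s   <- sym("Petal.Length")             # a symbol naming a column
e   <- parse_expr("Petal.Length * 2")  # a call: `*`(Petal.Length, 2)
odd <- sym("Petal.Length * 2")         # one symbol whose name is the whole string

is_symbol(s)    # TRUE
is_call(e)      # TRUE
is_symbol(odd)  # TRUE: sym() does not parse, it only names

# The parsed expression evaluates against the data ...
doubled <- eval_tidy(e, data = iris)

# ... but the odd symbol fails, because no object is called `Petal.Length * 2`.
looked_up <- tryCatch(eval_tidy(odd, data = iris), error = function(err) NULL)
is.null(looked_up)  # TRUE
```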
In general, you will use sym when your string refers to an object (although parse_expr would also work), and use parse_expr when you are trying to parse any other R code for further evaluation.
For this particular use case, variable is supposed to be referencing an object, so turning it into a sym would work. On the other hand, parsing it as an expression would also work, because that is the code that is going to be evaluated inside group_by when it is unquoted by !!.
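For a bare column name, the two routes can be compared side by side. The sketch below uses illustrative variable names (by_sym, by_expr, res_sym, res_expr) and checks that both produce the same object and the same grouped summary.

```r
library(dplyr)

by_sym  <- rlang::sym("Species")
by_expr <- rlang::parse_expr("Species")

# Parsing a bare name yields a symbol, so both objects are the same.
identical(by_sym, by_expr)  # TRUE

res_sym <- iris %>%
  group_by(!!by_sym) %>%
  summarise(Petal.Length = mean(Petal.Length))

res_expr <- iris %>%
  group_by(!!by_expr) %>%
  summarise(Petal.Length = mean(Petal.Length))

# Both routes give the same summary table.
isTRUE(all.equal(res_sym, res_expr))  # TRUE
```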