How to pass strings denoting expressions to dplyr 0.7 verbs?

Question

I would like to understand how to pass strings representing expressions into dplyr, so that the variables mentioned in the string are evaluated as expressions on columns in the dataframe. The main vignette on this topic covers passing in quosures, and doesn't discuss strings at all.

It's clear that quosures are safer and clearer than strings when representing expressions, so of course we should avoid strings when quosures can be used instead. However, when working with tools outside the R ecosystem, such as javascript or YAML config files, one will often have to work with strings instead of quosures.

For example, say I want a function that does a grouped tally using expressions passed in by the user/caller. As expected, the following code doesn't work, since dplyr uses nonstandard evaluation to interpret the arguments to group_by.

library(tidyverse)

group_by_and_tally <- function(data, groups) {
  data %>%
    group_by(groups) %>%
    tally()
}

my_groups <- c('2 * cyl', 'am')
mtcars %>%
  group_by_and_tally(my_groups)
#> Error in grouped_df_impl(data, unname(vars), drop): Column `groups` is unknown

In dplyr 0.5 we would use standard evaluation, such as group_by_(.dots = groups), to handle this situation. Now that the underscore verbs are deprecated, how should we do this kind of thing in dplyr 0.7?

In the special case of expressions that are just column names we can use the solutions to this question, but they don't work for more complex expressions like 2 * cyl that aren't just a column name.

Recommended answer

It's important to note that, in this simple example, we have control of how the expressions are created. So the best way to pass the expressions is to construct and pass quosures directly using quos():

library(tidyverse)
library(rlang)

group_by_and_tally <- function(data, groups) {
  data %>%
    group_by(!!!groups) %>%  # !!! splices the list of quosures (operator form of UQS())
    tally()
}

my_groups <- quos(2 * cyl, am)
mtcars %>%
  group_by_and_tally(my_groups)
#> # A tibble: 6 x 3
#> # Groups:   2 * cyl [?]
#>   `2 * cyl`    am     n
#>       <dbl> <dbl> <int>
#> 1         8     0     3
#> 2         8     1     8
#> 3        12     0     4
#> 4        12     1     3
#> 5        16     0    12
#> 6        16     1     2

However, if we receive the expressions from an outside source in the form of strings, we can simply parse the expressions first, which converts them to quosures:

my_groups <- c('2 * cyl', 'am')
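# Parse each string into a quosure. Later rlang releases replace
# parse_quosure() with parse_quo(), which takes an explicit env argument.
my_groups <- my_groups %>% map(parse_quosure)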
mtcars %>%
  group_by_and_tally(my_groups)
#> # A tibble: 6 x 3
#> # Groups:   2 * cyl [?]
#>   `2 * cyl`    am     n
#>       <dbl> <dbl> <int>
#> 1         8     0     3
#> 2         8     1     8
#> 3        12     0     4
#> 4        12     1     3
#> 5        16     0    12
#> 6        16     1     2

Again, we should only do this if we are getting expressions from an outside source that provides them as strings - otherwise we should make quosures directly in the R source code.
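
For callers that always receive strings, the parsing step can be folded into the function itself. The following is only a minimal sketch, not part of the original answer: the helper name group_by_and_tally_str is hypothetical, and it assumes a version of rlang that provides parse_expr() (which turns one string into an unevaluated expression) together with a dplyr that supports !!! splicing.

```r
library(dplyr)
library(rlang)

# Hypothetical helper (name chosen for illustration): takes a character
# vector of expressions, parses each one, and splices them into group_by().
group_by_and_tally_str <- function(data, group_strings) {
  # parse_expr() handles a single string, so apply it element by element
  groups <- lapply(group_strings, parse_expr)
  data %>%
    group_by(!!!groups) %>%  # splice the parsed expressions as grouping terms
    tally()
}

result <- group_by_and_tally_str(mtcars, c("2 * cyl", "am"))
print(result)
```

Because the strings are parsed into plain expressions rather than quosures, they are evaluated against the columns of data when group_by() runs. Anything a string refers to outside the data frame is looked up from inside the helper, so this sketch suits expressions that mention only column names and constants.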
