How to parametrize function calls in dplyr 0.7?

Question

The release of dplyr 0.7 includes a major overhaul of programming with dplyr. I read this document carefully, and I am trying to understand how it will impact my use of dplyr.

Here is a common idiom I use when building reporting and aggregation functions with dplyr:

my_report <- function(data, grouping_vars) {
  data %>%
    group_by_(.dots=grouping_vars) %>%
    summarize(x_mean=mean(x), x_median=median(x), ...)
}

Here, grouping_vars is a vector of strings.

I like this idiom because I can pass in string vectors from other places, say a file or a Shiny app's reactive UI, and it's not too bad for interactive work either.

However, in the new "Programming with dplyr" vignette, I see no examples of how something like this can be done with the new dplyr. I only see examples showing that passing strings is no longer the correct approach, and that I have to use quosures instead.

I'm happy to adopt quosures, but how exactly do I get from strings to the quosures expected by dplyr here? It doesn't seem feasible to expect the entire R ecosystem to provide quosures to dplyr - lots of times we're going to get strings and they'll have to be converted.

Here is an example showing what you're now supposed to do, and how my old idiom doesn't work:

library(dplyr)
grouping_vars <- quo(am)
mtcars %>%
  group_by(!!grouping_vars) %>%
  summarise(mean_cyl=mean(cyl))
#> # A tibble: 2 × 2
#>      am mean_cyl
#>   <dbl>    <dbl>
#> 1     0 6.947368
#> 2     1 5.076923

grouping_vars <- "am"
mtcars %>%
  group_by(!!grouping_vars) %>%
  summarise(mean_cyl=mean(cyl))
#> # A tibble: 1 × 2
#>   `"am"` mean_cyl
#>    <chr>    <dbl>
#> 1     am   6.1875
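
The second result above is what one would expect from `!!` applied to a string: the value "am" is inlined as a constant, so every row falls into a single group. One possible way to get from strings to something `group_by()` treats as column references is to convert the strings to symbols first. The following is a minimal sketch, assuming the rlang package (a dplyr dependency) is available; the name `by_symbols` is only illustrative.

```r
library(dplyr)

# Column names arriving as plain strings (e.g. from a file or a Shiny input).
grouping_vars <- c("am", "gear")

# rlang::syms() turns each string into a symbol; !!! splices the resulting
# list into group_by() as if the column names had been typed by hand.
by_symbols <- mtcars %>%
  group_by(!!!rlang::syms(grouping_vars)) %>%
  summarise(mean_cyl = mean(cyl))

print(by_symbols)
```

With a single column name, `rlang::sym()` together with `!!` plays the same role.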

Recommended answer

dplyr will have a specialized group_by variant, group_by_at, to deal with multiple grouping variables. It is much easier to use this new member of the _at family:

# using the pre-release 0.6.0

cols <- c("am","gear")

mtcars %>%
    group_by_at(.vars = cols) %>%
    summarise(mean_cyl=mean(cyl))

# Source: local data frame [4 x 3]
# Groups: am [?]
# 
#      am  gear mean_cyl
#   <dbl> <dbl>    <dbl>
# 1     0     3 7.466667
# 2     0     4 5.000000
# 3     1     4 4.500000
# 4     1     5 6.000000
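
Applied to the reporting function from the question, the same idea might look like the sketch below. It assumes dplyr 0.7 or later and uses the `cyl` column of `mtcars` in place of the question's `x`; the summary names `mean_cyl` and `median_cyl` are illustrative, not part of the original function.

```r
library(dplyr)

# grouping_vars stays a character vector, as in the original idiom.
my_report <- function(data, grouping_vars) {
  data %>%
    group_by_at(.vars = grouping_vars) %>%
    summarise(mean_cyl = mean(cyl), median_cyl = median(cyl))
}

report <- my_report(mtcars, c("am", "gear"))
print(report)
```

In dplyr 1.0.0 and later, `group_by_at()` is marked as superseded in favour of `group_by(across(all_of(grouping_vars)))`, though it remains available.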

The .vars argument accepts a character vector of column names, a numeric vector of column positions, or a list of columns generated by vars():

.vars

A list of columns generated by vars(), or a character vector of column names, or a numeric vector of column positions.
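
The three accepted forms can be compared side by side. This is a small sketch on `mtcars`, where `am` and `gear` happen to be columns 9 and 10; the variable names `by_name`, `by_vars`, and `by_pos` are only for illustration.

```r
library(dplyr)

# 1. Character vector of column names
by_name <- mtcars %>% group_by_at(.vars = c("am", "gear"))

# 2. Columns selected with vars()
by_vars <- mtcars %>% group_by_at(vars(am, gear))

# 3. Numeric vector of column positions (am is column 9, gear is column 10)
by_pos <- mtcars %>% group_by_at(c(9, 10))

# All three should report the same grouping variables.
print(group_vars(by_name))
print(group_vars(by_vars))
print(group_vars(by_pos))
```

The character form is usually the most convenient when the names come from outside R code, since positions break as soon as the column order changes.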
