Standard evaluation in dplyr: summarise a variable given as a character string


Question

Update, July 2020:

dplyr 1.0 has changed pretty much everything about this question, as well as all of the answers. See the dplyr programming vignette here:

https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html

When a column's name is stored in a character vector, the new way to refer to that column is to use the .data pronoun from rlang and subset it with [[ as you would in base R.

library(dplyr)

key <- "v3"
val <- "v2"
drp <- "v1"

df <- tibble(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))

df %>% 
    select(-matches(drp)) %>% 
    group_by(.data[[key]]) %>% 
    summarise(total = sum(.data[[val]], na.rm = TRUE))

#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 2 x 2
#>   v3    total
#>   <chr> <int>
#> 1 A        21
#> 2 B        19

If your code is in a package function, you can add @importFrom rlang .data to avoid R CMD check notes about undefined global variables.

Original question:

I want to refer to a column inside summarise() whose name is not known in advance. The standard evaluation functions introduced in dplyr 0.3 allow column names to be referenced using variables, but this doesn't appear to work when you call a base R function within, for example, summarise_().

library(dplyr)
 
key <- "v3"
val <- "v2"
drp <- "v1"
 
df <- data_frame(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))

df looks like this:

> df
Source: local data frame [5 x 3]

  v1 v2 v3
1  1  6  A
2  2  7  A
3  3  8  A
4  4  9  B
5  5 10  B

I want to drop v1, group by v3, and sum v2 for each group:

df %>% select(-matches(drp)) %>% group_by_(key) %>% summarise_(sum(val, na.rm = TRUE))

Error in sum(val, na.rm = TRUE) : invalid 'type' (character) of argument

The NSE version of select() works fine, since matches() can match a character string. The SE version, group_by_(), works fine, since it accepts variables as arguments and evaluates them. However, I haven't found a way to achieve similar results when using base R functions inside dplyr functions.

Things that don't work:

df %>% group_by_(key) %>% summarise_(sum(get(val), na.rm = TRUE))
Error in get(val) : object 'v2' not found

df %>% group_by_(key) %>% summarise_(sum(eval(as.symbol(val)), na.rm = TRUE))
Error in eval(expr, envir, enclos) : object 'v2' not found

I've checked out several related questions, but none of the proposed solutions have worked for me so far.

Recommended answer

dplyr 1.0 has changed pretty much everything about this question, as well as all of the answers. See the dplyr programming vignette here:

https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html

When a column's name is stored in a character vector, the new way to refer to that column is to use the .data pronoun from rlang and subset it with [[ as you would in base R.

library(dplyr)

key <- "v3"
val <- "v2"
drp <- "v1"

df <- tibble(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))

df %>% 
    select(-matches(drp)) %>% 
    group_by(.data[[key]]) %>% 
    summarise(total = sum(.data[[val]], na.rm = TRUE))

#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 2 x 2
#>   v3    total
#>   <chr> <int>
#> 1 A        21
#> 2 B        19
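When several column names are held in character vectors, a related pattern from the same vignette is across() combined with all_of(). The following is only a sketch under that assumption; the names keys and vals are hypothetical and not part of the original question.

```r
library(dplyr)

df <- tibble(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))

# Hypothetical character vectors holding the column names
keys <- "v3"            # grouping column(s)
vals <- c("v1", "v2")   # columns to sum

res <- df %>%
    # all_of() selects exactly the columns named in the character vector
    group_by(across(all_of(keys))) %>%
    # apply the same summary to every column named in vals
    summarise(across(all_of(vals), ~ sum(.x, na.rm = TRUE)), .groups = "drop")

res
# Group A: v1 = 6, v2 = 21; group B: v1 = 9, v2 = 19
```

Unlike .data[[val]], which looks up one column at a time, all_of() accepts a vector of names, so the same call works whether vals holds one name or many.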

If your code is in a package function, you can add @importFrom rlang .data to avoid R CMD check notes about undefined global variables.
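As a rough illustration of what that might look like, here is a sketch of a package function documented with roxygen2. The function name sum_by and its arguments are hypothetical; only the @importFrom rlang .data tag comes from the answer above.

```r
#' Sum one column within the groups of another (hypothetical helper)
#'
#' @param data A data frame.
#' @param key Name of the grouping column, as a single string.
#' @param val Name of the column to sum, as a single string.
#' @importFrom rlang .data
#' @export
sum_by <- function(data, key, val) {
    # .data[[key]] looks the column up by the string stored in key
    grouped <- dplyr::group_by(data, .data[[key]])
    dplyr::summarise(
        grouped,
        total = sum(.data[[val]], na.rm = TRUE),
        .groups = "drop"
    )
}

# Example call outside of a package; requires dplyr to be installed
example <- data.frame(v2 = 6:10, v3 = c("A", "A", "A", "B", "B"))
sum_by(example, "v3", "v2")
```

The functions are called with the dplyr:: prefix here so the sketch does not depend on further @importFrom tags; a real package would also list dplyr and rlang under Imports in its DESCRIPTION file.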
