Standard evaluation in dplyr: summarise a variable given as a character string

Published: 2021/12/1 21:12:06. Tags: r, dplyr


Question

Update, July 2020:

dplyr 1.0 has changed pretty much everything about this question, as well as all of the answers. See the dplyr programming vignette here:

https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html

When a column's identifier is stored as a character vector, the new way to refer to that column is to use the .data pronoun from rlang and subset it with [[ as you would in base R.

library(dplyr)

key <- "v3"
val <- "v2"
drp <- "v1"

df <- tibble(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))

df %>% 
    select(-matches(drp)) %>% 
    group_by(.data[[key]]) %>% 
    summarise(total = sum(.data[[val]], na.rm = TRUE))

#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 2 x 2
#>   v3    total
#>   <chr> <int>
#> 1 A        21
#> 2 B        19
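
The message in the output above appears because summarise() drops the last grouping level. As a minimal sketch, assuming dplyr 1.0 or later, passing .groups = "drop" states that intent explicitly and should silence the message. The name totals is only illustrative and does not come from the original post.

```r
library(dplyr)

key <- "v3"
val <- "v2"

df <- tibble(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))

# `totals` is an illustrative name, not from the original post.
totals <- df %>%
    group_by(.data[[key]]) %>%
    # .groups = "drop" returns an ungrouped tibble
    summarise(total = sum(.data[[val]], na.rm = TRUE), .groups = "drop")

totals
```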

If your code is in a package function, you can add @importFrom rlang .data to avoid R CMD check notes about undefined global variables.
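
As a rough sketch of what that could look like, assuming roxygen2 generates the package NAMESPACE: the function name sum_by and its arguments are hypothetical and chosen only for illustration.

```r
library(dplyr)

#' Sum one column within groups defined by another
#'
#' Hypothetical helper for illustration; not part of dplyr or rlang.
#'
#' @param data A data frame.
#' @param key,val Column names, each given as a single string.
#' @importFrom rlang .data
#' @importFrom dplyr group_by summarise
sum_by <- function(data, key, val) {
    # .data[[...]] looks the string up as a column of `data`
    grouped <- group_by(data, .data[[key]])
    summarise(grouped, total = sum(.data[[val]], na.rm = TRUE), .groups = "drop")
}

df <- tibble(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))
sum_by(df, key = "v3", val = "v2")
```

The import tag does not change how the function runs; it only declares .data so that the check does not report it as an undefined global.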

Original question:

I want to refer to an unknown column name inside a summarise. The standard evaluation functions introduced in dplyr 0.3 allow column names to be referenced using variables, but this doesn't appear to work when you call a base R function inside, for example, a summarise.

library(dplyr)
 
key <- "v3"
val <- "v2"
drp <- "v1"
 
df <- data_frame(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))

df looks like this:

> df
Source: local data frame [5 x 3]

  v1 v2 v3
1  1  6  A
2  2  7  A
3  3  8  A
4  4  9  B
5  5 10  B

I want to drop v1, group by v3, and sum v2 for each group:

df %>% select(-matches(drp)) %>% group_by_(key) %>% summarise_(sum(val, na.rm = TRUE))

Error in sum(val, na.rm = TRUE) : invalid 'type' (character) of argument

The NSE version of select() works fine, since it can match a character string. The SE version of group_by() works fine, since it can now accept variables as arguments and evaluate them. However, I haven't found a way to achieve similar results when using base R functions inside dplyr functions.

Things that don't work:

df %>% group_by_(key) %>% summarise_(sum(get(val), na.rm = TRUE))
Error in get(val) : object 'v2' not found

df %>% group_by_(key) %>% summarise_(sum(eval(as.symbol(val)), na.rm = TRUE))
Error in eval(expr, envir, enclos) : object 'v2' not found

I've checked out several related questions, but none of the proposed solutions have worked for me so far.

Recommended answer

The recommended answer is identical to the July 2020 update at the top of the question: refer to the columns through the .data pronoun, using group_by(.data[[key]]) and summarise(total = sum(.data[[val]], na.rm = TRUE)).
