Standard evaluation in dplyr: summarise a variable given as a character string

Question

Update, July 2020:

dplyr 1.0 has changed pretty much everything about this question, as well as all of the answers. See the dplyr programming vignette here:

https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html

The new way to refer to a column whose name is stored in a character vector is to use the .data pronoun from rlang and subset it with [[ ]], as you would in base R.

library(dplyr)

key <- "v3"
val <- "v2"
drp <- "v1"

df <- tibble(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))

df %>% 
    select(-matches(drp)) %>% 
    group_by(.data[[key]]) %>% 
    summarise(total = sum(.data[[val]], na.rm = TRUE))

#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 2 x 2
#>   v3    total
#>   <chr> <int>
#> 1 A        21
#> 2 B        19

If your code is in a package function, you can add the roxygen tag @importFrom rlang .data to avoid R CMD check notes about undefined globals.

Original question:

I want to refer to an unknown column name inside a summarise. The standard evaluation functions introduced in dplyr 0.3 allow column names to be referenced using variables, but this doesn't appear to work when you call a base R function inside, for example, a summarise.

library(dplyr)
 
key <- "v3"
val <- "v2"
drp <- "v1"
 
df <- data_frame(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))

df looks like this:

> df
Source: local data frame [5 x 3]

  v1 v2 v3
1  1  6  A
2  2  7  A
3  3  8  A
4  4  9  B
5  5 10  B

I want to drop v1, group by v3, and sum v2 for each group:

df %>% select(-matches(drp)) %>% group_by_(key) %>% summarise_(sum(val, na.rm = TRUE))

Error in sum(val, na.rm = TRUE) : invalid 'type' (character) of argument

The NSE version of select() works fine, since it can match a character string. The SE version of group_by() works fine, since it can now accept variables as arguments and evaluate them. However, I haven't found a way to achieve similar results when using base R functions inside dplyr functions.
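To make the failure concrete, here is a minimal base R sketch of what goes wrong. It is only an illustration: it rebuilds the example data as a plain data.frame (an assumption made to avoid needing dplyr) and shows that sum() is handed the string "v2" itself rather than the column it names.

```r
# Base R only; `df` and `val` mirror the example above.
df <- data.frame(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))
val <- "v2"

# sum() receives the character string "v2", not the column, so it errors.
msg <- tryCatch(
  sum(val, na.rm = TRUE),
  error = function(e) conditionMessage(e)
)
print(msg)

# Subsetting the data by name retrieves the column itself.
total <- sum(df[[val]], na.rm = TRUE)
print(total)  # 6 + 7 + 8 + 9 + 10 = 40
```

The string has to be turned into a column lookup against the data before sum() sees it; that lookup is the step missing from the summarise_() call.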

These don't work either:

df %>% group_by_(key) %>% summarise_(sum(get(val), na.rm = TRUE))
Error in get(val) : object 'v2' not found

df %>% group_by_(key) %>% summarise_(sum(eval(as.symbol(val)), na.rm = TRUE))
Error in eval(expr, envir, enclos) : object 'v2' not found

I've checked out several related questions, but none of the proposed solutions have worked for me so far.

Recommended answer

dplyr 1.0 has changed pretty much everything about this question, as well as all of the answers. See the dplyr programming vignette here:

https://cran.r-project.org/web/packages/dplyr/vignettes/programming.html

The new way to refer to a column whose name is stored in a character vector is to use the .data pronoun from rlang and subset it with [[ ]], as you would in base R.

library(dplyr)

key <- "v3"
val <- "v2"
drp <- "v1"

df <- tibble(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))

df %>% 
    select(-matches(drp)) %>% 
    group_by(.data[[key]]) %>% 
    summarise(total = sum(.data[[val]], na.rm = TRUE))

#> `summarise()` ungrouping output (override with `.groups` argument)
#> # A tibble: 2 x 2
#>   v3    total
#>   <chr> <int>
#> 1 A        21
#> 2 B        19

If your code is in a package function, you can add the roxygen tag @importFrom rlang .data to avoid R CMD check notes about undefined globals.
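As a rough sketch of how this might look inside a package function, the helper below wraps the same pipeline. The name sum_by and its argument names are hypothetical, not part of dplyr or rlang, and the sketch assumes dplyr 1.0 or later (for the .groups argument).

```r
library(dplyr)

# Hypothetical helper (not part of dplyr): sums column `val` within the
# groups defined by column `key`; both are given as character strings.
#' @importFrom rlang .data
sum_by <- function(data, key, val) {
  data %>%
    group_by(.data[[key]]) %>%
    summarise(total = sum(.data[[val]], na.rm = TRUE), .groups = "drop")
}

df <- tibble(v1 = 1:5, v2 = 6:10, v3 = c(rep("A", 3), rep("B", 2)))
res <- sum_by(df, key = "v3", val = "v2")
print(res)
```

Passing .groups = "drop" returns an ungrouped tibble and silences the message about ungrouping that summarise() prints in the output above.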
