dplyr with name of columns in a function

Views: 94 | Posted: 2020/10/26 3:29:40 | Tags: r, function, dplyr

Question

I can't figure out how to use column names inside a function with the dplyr R package. A reproducible example is below:

Data

set.seed(12345)
Y <- rnorm(10)
Env <- paste0("E", rep(1:2, each = 5))
Gen <- paste0("G", rep(1:5, times = 2))
df1 <- data.frame(Y, Env, Gen)

Works outside a function

library(dplyr)
# Group by the actual column names of df1 (Env, Gen) and average Y
df1 %>%
  dplyr::group_by(Env, Gen) %>%
  dplyr::summarize(mean(Y))

# Base R equivalent
with(data = df1, expr = tapply(X = Y, INDEX = list(Env, Gen), FUN = mean))

First function

fn1 <- function(Y, E, G, data){
  Y <- deparse(substitute(Y))
  E <- deparse(substitute(E))
  G <- deparse(substitute(G))
  Out <- with(data = data, tapply(X = Y, INDEX = list(E, G), FUN = mean), parent.frame())
  return(Out)
}  

fn1(Y = Y, E = Env, G = Gen, data = df1)

Error in tapply(X = Y, INDEX = list(E, G), FUN = mean) : arguments must have same length

Second function

fn2 <- function(Y, E, G, data){
  Y <- deparse(substitute(Y))
  E <- deparse(substitute(E))
  G <- deparse(substitute(G))
  library(dplyr)
  Out <- df1 %>%
    dplyr::group_by(E, G) %>%
    dplyr::summarize(mean(Y))
  return(Out)
}  

fn2(Y = Y, E = Env, G = Gen, data = df1)

Error in grouped_df_impl(data, unname(vars), drop) : Column `E` is unknown

Recommended answer

One option is to use enquo to capture the expression and its environment in a quosure object, which can then be evaluated within group_by, summarise, mutate, etc. by using the !! operator or UQ (unquote expression).

fn2 <- function(Y, E, G, data){
 E <- enquo(E)
 G <- enquo(G)
 Y <- enquo(Y)
 data %>%
    dplyr::group_by(!! E, !! G) %>%
    dplyr::summarize(Y = mean(!!Y))

}

fn2(Y, E = Env, G = Gen, df1)
# A tibble: 10 x 3
# Groups: Env [?]
#   Env    Gen         Y
#   <fctr> <fctr>  <dbl>
# 1 E1     G1      0.586
# 2 E1     G2      0.709
# 3 E1     G3     -0.109
# 4 E1     G4     -0.453
# 5 E1     G5      0.606
# 6 E2     G1     -1.82 
# 7 E2     G2      0.630
# 8 E2     G3     -0.276
# 9 E2     G4     -0.284
#10 E2     G5     -0.919
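
For readers on newer package versions, the enquo plus !! pattern above can usually be shortened with the embrace operator {{ }}. The following is a minimal sketch, not part of the original answer: it assumes rlang 0.4.0 or later and dplyr 1.0.0 or later (for the .groups argument), and fn_embrace is a hypothetical name chosen for illustration.

```r
library(dplyr)

# Same data as in the question
set.seed(12345)
df1 <- data.frame(
  Y   = rnorm(10),
  Env = paste0("E", rep(1:2, each = 5)),
  Gen = paste0("G", rep(1:5, times = 2))
)

# fn_embrace is a hypothetical name; {{ arg }} captures the argument
# and unquotes it in one step, like enquo(arg) followed by !!arg
fn_embrace <- function(data, Y, E, G) {
  data %>%
    group_by({{ E }}, {{ G }}) %>%
    summarise(Y = mean({{ Y }}), .groups = "drop")
}

out_embrace <- fn_embrace(df1, Y = Y, E = Env, G = Gen)
out_embrace
```

The call site is unchanged from the enquo version: column names are still passed unquoted.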

In the OP's function, the expression is captured by substitute, but deparse then converts it to a string. Using sym from rlang, the string can be converted to a symbol and then evaluated with !! or UQ as above.

fn2 <- function(Y, E, G, data){
   Y <- deparse(substitute(Y))
   E <- deparse(substitute(E))
   G <- deparse(substitute(G))

   data %>%   # use the data argument rather than the global df1
    dplyr::group_by(!!rlang::sym(E), !! rlang::sym(G)) %>%
    dplyr::summarize(Y = mean(!! rlang::sym(Y)))

}  

fn2(Y = Y, E = Env, G = Gen, data = df1)
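
The same string issue appears to explain the tapply error in the first function: after deparse, E and G are length-one strings, not columns, so their lengths do not match the length of Y. A base R sketch of one possible fix is to use the strings to index the data frame with [[ ]]. The name fn1_fixed is hypothetical and this variant is not part of the original answer.

```r
# Same data as in the question
set.seed(12345)
df1 <- data.frame(
  Y   = rnorm(10),
  Env = paste0("E", rep(1:2, each = 5)),
  Gen = paste0("G", rep(1:5, times = 2))
)

# fn1_fixed is a hypothetical name for a base R variant of fn1
fn1_fixed <- function(Y, E, G, data) {
  # Turn the unquoted arguments into column-name strings
  Y <- deparse(substitute(Y))
  E <- deparse(substitute(E))
  G <- deparse(substitute(G))
  # Look the columns up by name so every vector has nrow(data) elements
  tapply(X = data[[Y]], INDEX = list(data[[E]], data[[G]]), FUN = mean)
}

res <- fn1_fixed(Y = Y, E = Env, G = Gen, data = df1)
res
```

The result is a matrix with one row per Env level and one column per Gen level.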

Another variant of the OP's function, without using rlang, is to use group_by_at and summarise_at, which can take strings as arguments.

fn3 <- function(Y, E, G, data){
  Y <- deparse(substitute(Y))
  E <- deparse(substitute(E))
  G <- deparse(substitute(G))

   data %>%   # use the data argument rather than the global df1
    dplyr::group_by_at(vars(E, G)) %>%
    dplyr::summarize_at(vars(Y), mean)

}  

fn3(Y = Y, E = Env, G = Gen, data = df1)
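
In dplyr 1.0.0 and later, the _at variants are superseded by across(). A hedged sketch of the same string-based approach using across() with all_of() might look like the following. It assumes dplyr 1.0.0 or later, and fn_across is a hypothetical name not taken from the original answer.

```r
library(dplyr)

# Same data as in the question
set.seed(12345)
df1 <- data.frame(
  Y   = rnorm(10),
  Env = paste0("E", rep(1:2, each = 5)),
  Gen = paste0("G", rep(1:5, times = 2))
)

# fn_across is a hypothetical name; all_of() selects columns
# from a character vector of names
fn_across <- function(Y, E, G, data) {
  Y <- deparse(substitute(Y))
  E <- deparse(substitute(E))
  G <- deparse(substitute(G))

  data %>%
    group_by(across(all_of(c(E, G)))) %>%
    summarise(across(all_of(Y), mean), .groups = "drop")
}

out_across <- fn_across(Y = Y, E = Env, G = Gen, data = df1)
out_across
```

Because all_of() takes plain strings, the same body would also work if the function accepted the column names as character arguments directly.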
