Supplying multiple groups of variables to a function for dplyr arguments in the body
Question
Here is the data:
library(tidyverse)
data <- tibble::tribble(
~var1, ~var2, ~var3, ~var4, ~var5,
"a", "d", "g", "hello", 1L,
"a", "d", "h", "hello", 2L,
"b", "e", "h", "k", 4L,
"b", "e", "h", "k", 7L,
"c", "f", "i", "hello", 3L,
"c", "f", "i", "hello", 4L
)
And these are the vectors I want to use:
filter_var <- c("hello")
groupby_vars1 <- c("var1", "var2", "var3")
groupby_vars2 <- c("var1", "var2")
joinby_vars1 <- c("var1", "var2")
joinby_vars2 <- c("var1", "var2", "var3")
The 2nd and 5th vectors are identical, as are the 3rd and 4th, but please treat them as different and keep them as separate vectors.
Now I want to write a generic function that takes the data and these vectors and returns the result.
my_fun <- function(data, filter_var, groupby_vars1,groupby_vars2, joinby_vars1, joinby_vars2) {
data2 <- data %>% filter(var4 == filter_var)
data3 <- data2 %>%
group_by(groupby_vars1) %>%
summarise(var6 = sum(var5))
data4 <- data3 %>%
ungroup() %>%
group_by(groupby_vars2) %>%
summarise(avg = mean(var6,na.rm = T))
data5 <- data3 %>% left_join(data4, by = joinby_vars1)
data6 <- data %>% left_join(data5, by = joinby_vars2)
}
The problem is how to supply multiple vectors of variable names to a function so that they can be used as dplyr arguments in its body. I tried looking into http://dplyr.tidyverse.org/articles/programming.html, but could not solve the above problem.
Recommended answer
group_by cannot take the groupby_vars... character vectors as input directly. You need to use rlang::syms() to turn each vector of strings into a list of symbols, then use !!! to unquote and splice them so that they are evaluated as column names inside group_by. The by argument of left_join already accepts character vectors, so the joinby_vars... vectors can be passed through unchanged.
library(tidyverse)
library(rlang)
data <- tibble::tribble(
~var1, ~var2, ~var3, ~var4, ~var5,
"a", "d", "g", "hello", 1L,
"a", "d", "h", "hello", 2L,
"b", "e", "h", "k", 4L,
"b", "e", "h", "k", 7L,
"c", "f", "i", "hello", 3L,
"c", "f", "i", "hello", 4L
)
filter_var <- c("hello")
groupby_vars1 <- c("var1", "var2", "var3")
groupby_vars2 <- c("var1", "var2")
joinby_vars1 <- c("var1", "var2")
joinby_vars2 <- c("var1", "var2", "var3")
my_fun <- function(data, filter_var,
groupby_vars1, groupby_vars2,
joinby_vars1, joinby_vars2) {
groupby_vars1 <- syms(groupby_vars1)
groupby_vars2 <- syms(groupby_vars2)
data2 <- data %>%
filter(var4 == filter_var)
data3 <- data2 %>%
group_by(!!! groupby_vars1) %>%
summarise(var6 = sum(var5))
data4 <- data3 %>%
ungroup() %>%
group_by(!!! groupby_vars2) %>%
summarise(avg = mean(var6, na.rm = TRUE))
data5 <- data3 %>%
left_join(data4, by = joinby_vars1)
data6 <- data %>%
left_join(data5, by = joinby_vars2)
return(data6)
}
my_fun(data, filter_var,
groupby_vars1, groupby_vars2,
joinby_vars1, joinby_vars2)
#> # A tibble: 6 x 7
#> var1 var2 var3 var4 var5 var6 avg
#> <chr> <chr> <chr> <chr> <int> <int> <dbl>
#> 1 a d g hello 1 1 1.5
#> 2 a d h hello 2 2 1.5
#> 3 b e h k 4 NA NA
#> 4 b e h k 7 NA NA
#> 5 c f i hello 3 7 7
#> 6 c f i hello 4 7 7
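As a hedged alternative to syms() and !!!, the sketch below groups by the character vectors with across(all_of()). It assumes dplyr 1.0.0 or later, which the original answer (written in 2018) predates; the name my_fun3 and the .groups = "drop" argument are illustrative choices, not part of the original answer.

```r
library(dplyr)

data <- tibble::tribble(
  ~var1, ~var2, ~var3, ~var4, ~var5,
  "a", "d", "g", "hello", 1L,
  "a", "d", "h", "hello", 2L,
  "b", "e", "h", "k", 4L,
  "b", "e", "h", "k", 7L,
  "c", "f", "i", "hello", 3L,
  "c", "f", "i", "hello", 4L
)

# Hypothetical variant of my_fun: the grouping columns stay character
# vectors and are selected with all_of() inside across().
my_fun3 <- function(data, filter_var,
                    groupby_vars1, groupby_vars2,
                    joinby_vars1, joinby_vars2) {
  data3 <- data %>%
    filter(var4 == filter_var) %>%
    group_by(across(all_of(groupby_vars1))) %>%
    summarise(var6 = sum(var5), .groups = "drop")

  data4 <- data3 %>%
    group_by(across(all_of(groupby_vars2))) %>%
    summarise(avg = mean(var6, na.rm = TRUE), .groups = "drop")

  data5 <- left_join(data3, data4, by = joinby_vars1)
  left_join(data, data5, by = joinby_vars2)
}

result <- my_fun3(
  data, "hello",
  c("var1", "var2", "var3"), c("var1", "var2"),
  c("var1", "var2"), c("var1", "var2", "var3")
)
```

With this form no conversion step is needed before grouping, and all_of() raises an error if a requested column is missing from the data.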
Another way to do it: parse the string vectors with parse_exprs() outside the function, then unquote the results with !!! inside it.
my_fun2 <- function(data, filter_var,
groupby_vars1, groupby_vars2,
joinby_vars1, joinby_vars2) {
data2 <- data %>%
filter(var4 == filter_var)
data3 <- data2 %>%
group_by(!!! groupby_vars1) %>%
summarise(var6 = sum(var5))
data4 <- data3 %>%
ungroup() %>%
group_by(!!! groupby_vars2) %>%
summarise(avg = mean(var6, na.rm = TRUE))
data5 <- data3 %>%
left_join(data4, by = joinby_vars1)
data6 <- data %>%
left_join(data5, by = joinby_vars2)
return(data6)
}
my_fun2(data, filter_var,
parse_exprs(groupby_vars1), parse_exprs(groupby_vars2),
joinby_vars1, joinby_vars2)
identical(my_fun(data, filter_var,
groupby_vars1, groupby_vars2,
joinby_vars1, joinby_vars2),
my_fun2(data, filter_var,
parse_exprs(groupby_vars1), parse_exprs(groupby_vars2),
joinby_vars1, joinby_vars2))
#> [1] TRUE
Created on 2018-04-24 by the reprex package (v0.2.0).
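The two approaches give identical results here because every string is a bare column name. The following sketch, which uses only rlang and illustrative variable names not taken from the answer, shows where syms() and parse_exprs() start to differ.

```r
library(rlang)

vars <- c("var1", "var2")

# For plain column names, both helpers return a list of symbols.
from_syms  <- syms(vars)
from_parse <- parse_exprs(vars)
syms_are_symbols  <- all(vapply(from_syms, is.symbol, logical(1)))
parse_are_symbols <- all(vapply(from_parse, is.symbol, logical(1)))

# They diverge once a string holds more than a bare name:
# parse_expr() parses the text as R code and yields a call, while
# sym() treats the whole string as a single (unusual) column name.
parsed <- parse_expr("var5 * 2")
named  <- sym("var5 * 2")
parsed_is_call  <- is.call(parsed)
named_is_symbol <- is.symbol(named)
```

In practice this suggests syms() as the safer choice when the strings are meant to be column names only, since parse_exprs() will interpret whatever code the strings contain.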