How to pass multiple group_by arguments and a dynamic variable argument to a dplyr function
Question
I am trying to pass multiple group_by arguments to a dplyr function, as well as a named variable. I understand that I need to use a quosure so that dplyr can interpret the variables I am passing to it. The following code works fine:
quantileMaker2 <- function(data, groupCol, calcCol) {
  groupCol <- enquo(groupCol)
  calcCol <- enquo(calcCol)
  data %>%
    group_by(!! groupCol) %>%
    summarise('25%' = currency(quantile(!! calcCol, probs = 0.25), digits = 2L),
              '50%' = currency(quantile(!! calcCol, probs = 0.50), digits = 2L),
              '75%' = currency(quantile(!! calcCol, probs = 0.75), digits = 2L),
              avg = currency(mean(!! calcCol), digits = 2L),
              nAgencies = n_distinct('POSIT ID'),
              nFTEs = sum(FTEs)
    )
}
quantileMaker2(df, employerClass, TCCperFTE)
However, when I run the following, I run into a problem:
quantileMaker3 <- function(data,...,calcCol) {
  groupCol <- quos(...)
  calcCol <- quo(calcCol)
  data %>%
    group_by(!!! groupCol) %>%
    summarise('25%' = currency(quantile(!! calcCol, probs = 0.25), digits = 2L),
              '50%' = currency(quantile(!! calcCol, probs = 0.50), digits = 2L),
              '75%' = currency(quantile(!! calcCol, probs = 0.75), digits = 2L),
              avg = currency(mean(!! calcCol), digits = 2L),
              nAgencies = n_distinct('POSIT ID'),
              nFTEs = sum(FTEs)
    )
}
This returns the following error:
Error in summarise_impl(.data, dots) :
Evaluation error: anyNA() applied to non-(list or vector) of type 'symbol'.
Sample data:
Year employerClass TCCperFTE FTEs POSIT ID
2014 One 5000 20 1
2014 Two 1000 30 2
2015 One 15000 40 1
2015 Two 50000 50 2
2016 One 100000 60 1
2016 Two 500000 70 2
Any help would be appreciated.
Recommended answer
You haven't provided sample data, but your function works when modified to use the mtcars data frame. The version below captures calcCol with enquo() instead of quo() and moves ... to the end of the argument list.
library(tidyverse)
library(formattable)
quantileMaker3 <- function(data, calcCol, ...) {
  groupCol <- quos(...)
  calcCol <- enquo(calcCol)
  data %>%
    group_by(!!!groupCol) %>%
    summarise('25%' = currency(quantile(!!calcCol, probs = 0.25), digits = 2L),
              '50%' = currency(quantile(!!calcCol, probs = 0.50), digits = 2L),
              '75%' = currency(quantile(!!calcCol, probs = 0.75), digits = 2L),
              avg = currency(mean(!!calcCol), digits = 2L),
              nAgencies = n_distinct(cyl),
              nFTEs = sum(hp)
    )
}
quantileMaker3(mtcars, mpg, cyl)
# A tibble: 3 x 7
cyl `25%` `50%` `75%` avg nAgencies nFTEs
<dbl> <S3: formattable> <S3: formattable> <S3: formattable> <S3: formattable> <int> <dbl>
1 4. $22.80 $26.00 $30.40 $26.66 1 909.
2 6. $18.65 $19.70 $21.00 $19.74 1 856.
3 8. $14.40 $15.20 $16.25 $15.10 1 2929.
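To illustrate why the switch from quo() to enquo() matters, here is a minimal sketch that assumes only the rlang package is installed. The helper names f_quo and f_enquo are hypothetical and are not part of the original answer. quo() captures the expression exactly as written inside the function, whereas enquo() captures the expression the caller supplied for that argument.

```r
library(rlang)

# Hypothetical helpers, for illustration only.
f_quo   <- function(x) quo(x)    # captures the literal symbol `x`
f_enquo <- function(x) enquo(x)  # captures whatever the caller passed as `x`

# Inspect the expression stored in each quosure.
expr_from_quo   <- quo_get_expr(f_quo(mpg))    # the symbol x
expr_from_enquo <- quo_get_expr(f_enquo(mpg))  # the symbol mpg

print(expr_from_quo)
print(expr_from_enquo)
```

Inside a function, then, quo(calcCol) refers to the argument's own name rather than to the column the caller named, which is consistent with the question's version failing.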
With multiple grouping arguments:
quantileMaker3(mtcars, mpg, cyl, vs)
# A tibble: 5 x 8
# Groups: cyl [?]
cyl vs `25%` `50%` `75%` avg nAgencies nFTEs
<dbl> <dbl> <S3: formattable> <S3: formattable> <S3: formattable> <S3: formattable> <int> <dbl>
1 4. 0. $26.00 $26.00 $26.00 $26.00 1 91.
2 4. 1. $22.80 $25.85 $30.40 $26.73 1 818.
3 6. 0. $20.35 $21.00 $21.00 $20.57 1 395.
4 6. 1. $18.03 $18.65 $19.75 $19.12 1 461.
5 8. 0. $14.40 $15.20 $16.25 $15.10 1 2929.
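The calls above pass the column to summarise first and the grouping columns last. One reason this ordering is convenient is that in R, a formal argument declared after ... can only be matched by name, so every unnamed value is absorbed by the dots. The sketch below uses base R only; show_args is a hypothetical helper, not part of the original answer.

```r
# Hypothetical helper, for illustration only: `last` is declared after `...`.
show_args <- function(..., last) {
  list(n_dots = length(list(...)), last_missing = missing(last))
}

positional <- show_args(1, 2, 3)         # all three values land in `...`
named      <- show_args(1, 2, last = 3)  # `last` is matched by its full name

print(positional)
print(named)
```

With a signature such as function(data, ..., calcCol), the column to summarise would therefore have to be supplied as calcCol = some_column.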
Incidentally, you can avoid multiple calls to quantile by using nesting. This won't work if any of the output columns are of class formattable (which is what the currency function returns), so I've changed the function to create strings for the currency-format columns.
quantileMaker3 <- function(data, calcCol, ..., quantiles=c(0.25,0.5,0.75)) {
  groupCol <- quos(...)
  calcCol <- enquo(calcCol)
  data %>%
    group_by(!!!groupCol) %>%
    summarise(values = list(paste0("$", sprintf("%1.2f", quantile(!!calcCol, probs=quantiles)))),
              qnames = list(sprintf("%1.0f%%", quantiles*100)),
              nAgencies = n_distinct(cyl),
              nFTEs = sum(hp),
              avg = paste0("$", sprintf("%1.2f", mean(!!calcCol)))
    ) %>%
    unnest %>%
    spread(qnames, values)
}
quantileMaker3(mtcars, mpg, cyl, vs)
# A tibble: 5 x 8
# Groups: cyl [3]
cyl vs nAgencies nFTEs avg `25%` `50%` `75%`
<dbl> <dbl> <int> <dbl> <chr> <chr> <chr> <chr>
1 4. 0. 1 91. $26.00 $26.00 $26.00 $26.00
2 4. 1. 1 818. $26.73 $22.80 $25.85 $30.40
3 6. 0. 1 395. $20.57 $20.35 $21.00 $21.00
4 6. 1. 1 461. $19.12 $18.03 $18.65 $19.75
5 8. 0. 1 2929. $15.10 $14.40 $15.20 $16.25
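As a side note, newer releases offer a shorter spelling of the same pattern. The sketch below is not the answer's method. It assumes rlang 0.4.0 or later (for the {{ }} operator) and dplyr 1.0.0 or later (for the .groups argument), and it omits the currency formatting to keep the example self-contained. The function name quantile_summary and the output column names are hypothetical.

```r
library(dplyr)

# Hypothetical helper, for illustration only.
quantile_summary <- function(data, calc_col, ...) {
  data %>%
    group_by(...) %>%  # dots are forwarded directly, no quos()/!!! needed
    summarise(
      q25 = quantile({{ calc_col }}, probs = 0.25, names = FALSE),
      q50 = quantile({{ calc_col }}, probs = 0.50, names = FALSE),
      q75 = quantile({{ calc_col }}, probs = 0.75, names = FALSE),
      avg = mean({{ calc_col }}),
      .groups = "drop"  # return an ungrouped result
    )
}

res <- quantile_summary(mtcars, mpg, cyl, vs)
print(res)
```

Here {{ calc_col }} stands in for the enquo() plus !! pair, and passing ... straight to group_by() stands in for quos() plus !!!.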