Passing column names through multiple functions with dplyr
Question
I wrote a simple function to create tables of percentages with dplyr:
library(dplyr)
library(tidyr)  # provides spread(), used further down

df = tibble(
  Gender = sample(c("Male", "Female"), 100, replace = TRUE),
  FavColour = sample(c("Red", "Blue"), 100, replace = TRUE)
)

quick_pct_tab = function(df, col) {
  col_quo = enquo(col)  # capture the bare column name
  df %>%
    count(!! col_quo) %>%
    mutate(Percent = (100 * n / sum(n)))
}

df %>% quick_pct_tab(FavColour)
# Output:
# A tibble: 2 x 3
#   FavColour     n Percent
#   <chr>     <int>   <dbl>
# 1 Blue         58      58
# 2 Red          42      42
This works great. However, when I tried to build on it by writing a new function that calculates the same percentages with grouping, I could not figure out how to call quick_pct_tab from within the new function, even after trying several combinations of quo(col), !! quo(col), enquo(col), and so on.
bygender_tab = function(df, col) {
  col_enquo = enquo(col)
  # Want to replace this with
  # df %>% quick_pct_tab(col)
  gender_tab = df %>%
    group_by(Gender) %>%
    count(!! col_enquo) %>%
    mutate(Percent = (100 * n / sum(n)))
  gender_tab %>%
    select(!! col_enquo, Gender, Percent) %>%
    spread(Gender, Percent)
}

df %>% bygender_tab(FavColour)
# A tibble: 2 x 3
#   FavColour   Female     Male
# * <chr>        <dbl>    <dbl>
# 1 Blue      52.08333 63.46154
# 2 Red       47.91667 36.53846
From what I understand, the standard-evaluation verbs in dplyr (the underscore-suffixed functions such as count_()) are deprecated, so it would be great to learn how to achieve this with dplyr > 0.7. How do I have to quote the col argument to pass it through to a further dplyr function?
Recommended answer
We need to apply !! to col_enquo when passing it on, so that it is unquoted and quick_pct_tab receives the column itself rather than the symbol col_enquo:
bygender_tab = function(df, col) {
  col_enquo = enquo(col)
  df %>%
    group_by(Gender) %>%
    quick_pct_tab(!! col_enquo) %>% ## change
    select(!! col_enquo, Gender, Percent) %>%
    spread(Gender, Percent)
}

df %>%
  bygender_tab(FavColour)
# A tibble: 2 x 3
#   FavColour   Female     Male
# * <chr>        <dbl>    <dbl>
# 1 Blue      54.54545 41.07143
# 2 Red       45.45455 58.92857
Using the OP's function, the output is:
# A tibble: 2 x 3
# FavColour Female Male
#* <chr> <dbl> <dbl>
#1 Blue 54.54545 41.07143
#2 Red 45.45455 58.92857
Note that no seed was set when creating the dataset, so the exact percentages differ from those in the question.
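To illustrate why the unquoting matters, here is a minimal sketch on a small fixed data set. The names toy, wrapper_plain and wrapper_unquoted are hypothetical and used only for this illustration. When the wrapper forwards the bare argument, the inner enquo() captures the symbol col instead of the column, so the call is expected to fail; with !! the column arrives intact.

```r
library(dplyr)

# Small fixed data set (hypothetical values, chosen only for illustration)
toy = tibble(
  Gender    = c("Male", "Male", "Female", "Female"),
  FavColour = c("Red", "Blue", "Blue", "Blue")
)

quick_pct_tab = function(df, col) {
  col_quo = enquo(col)
  df %>%
    count(!! col_quo) %>%
    mutate(Percent = 100 * n / sum(n))
}

# Forwards the bare argument: the inner enquo() only sees the symbol `col`
wrapper_plain = function(df, col) {
  df %>% quick_pct_tab(col)
}

# Captures the argument once, then unquotes it when passing it on
wrapper_unquoted = function(df, col) {
  col_enquo = enquo(col)
  df %>% quick_pct_tab(!! col_enquo)
}

# Catch the failure instead of stopping the script
res_plain    = tryCatch(wrapper_plain(toy, FavColour), error = function(e) e)
res_unquoted = wrapper_unquoted(toy, FavColour)

inherits(res_plain, "error")  # did the plain wrapper fail?
res_unquoted                  # percentage table for FavColour
```

The error is caught with tryCatch() so the comparison can run in one script; the exact error message depends on the installed dplyr and rlang versions.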
With rlang version 0.4.0 (run with dplyr 0.8.2), we can also use the {{ }} (curly-curly) operator, which quotes and unquotes the argument in a single step:
bygender_tabN = function(df, col) {
  df %>%
    group_by(Gender) %>%
    quick_pct_tab({{ col }}) %>% ## change
    select({{ col }}, Gender, Percent) %>%
    spread(Gender, Percent)
}

df %>%
  bygender_tabN(FavColour)
# A tibble: 2 x 3
#   FavColour Female  Male
#   <chr>      <dbl> <dbl>
# 1 Blue          50  46.3
# 2 Red           50  53.7
Checking the output against the previous function (set.seed was not provided):
df %>%
  bygender_tab(FavColour)
# A tibble: 2 x 3
# FavColour Female Male
# <chr> <dbl> <dbl>
#1 Blue 50 46.3
#2 Red 50 53.7
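Because the data above were generated without a seed, the comparison is easier to reproduce with one. The following is a sketch under stated assumptions: the seed value and the names df_seeded, tab_bang and tab_curly are arbitrary choices for this illustration, the ungroup() call is a defensive addition that is not in the answer, and {{ }} requires rlang 0.4.0 or later.

```r
library(dplyr)
library(tidyr)

set.seed(2024)  # arbitrary seed, chosen only so the run is reproducible
df_seeded = tibble(
  Gender    = sample(c("Male", "Female"), 100, replace = TRUE),
  FavColour = sample(c("Red", "Blue"), 100, replace = TRUE)
)

quick_pct_tab = function(df, col) {
  col_quo = enquo(col)
  df %>%
    count(!! col_quo) %>%
    mutate(Percent = 100 * n / sum(n))
}

# Variant 1: enquo() in the wrapper, !! when passing the column on
tab_bang = function(df, col) {
  col_enquo = enquo(col)
  df %>%
    group_by(Gender) %>%
    quick_pct_tab(!! col_enquo) %>%
    ungroup() %>%  # defensive addition, not part of the original answer
    select(!! col_enquo, Gender, Percent) %>%
    spread(Gender, Percent)
}

# Variant 2: the {{ }} shorthand
tab_curly = function(df, col) {
  df %>%
    group_by(Gender) %>%
    quick_pct_tab({{ col }}) %>%
    ungroup() %>%  # defensive addition, not part of the original answer
    select({{ col }}, Gender, Percent) %>%
    spread(Gender, Percent)
}

out_bang  = tab_bang(df_seeded, FavColour)
out_curly = tab_curly(df_seeded, FavColour)

# Compare as plain data frames to sidestep tibble-specific comparison methods
all.equal(as.data.frame(out_bang), as.data.frame(out_curly))
```

Both variants pass the same column through to quick_pct_tab, so on the same data they should produce the same table.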