dplyr - groupby on multiple columns using variable names
Question
I am working with R Shiny for some exploratory data analysis. I have two checkbox inputs that contain only the user-selected options. The first checkbox input contains only the categorical variables; the second contains only numeric variables. Next, I apply a group_by on these two selections:
var1 <- input$variable1  # Checkbox with categorical variables
var2 <- input$variable2  # Checkbox with numerical variables
v$data <- dataset %>%
  group_by_(var1) %>%
  summarize_(Sum = interp(~sum(x), x = as.name(var2))) %>%
  arrange(desc(Sum))
When only one categorical variable is selected, this group_by works perfectly. When multiple categorical variables are chosen, the selection is an array of column names. How do I pass this array of column names to dplyr's group_by?
Recommended answer
dplyr version >= 1.0
With more recent versions of dplyr, you should use across along with a tidyselect helper function. See help("language", "tidyselect") for a list of all the helper functions. In this case, if you want all the columns named in a character vector, use all_of():
library(dplyr)

# Character vector of the columns to group by
cols <- c("mpg", "hp", "wt")
mtcars %>%
  group_by(across(all_of(cols))) %>%
  summarize(x = mean(gear))
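Applied to the question's setup, the same pattern might look like the sketch below. The values assigned to var1 and var2 are hypothetical stand-ins for input$variable1 and input$variable2, mtcars stands in for the asker's dataset, and the sketch assumes a single numeric column is selected. The .data[[var2]] pronoun is used in place of the older interp() call.

```r
library(dplyr)

# Hypothetical stand-ins for the Shiny inputs (assumptions, not from the question)
var1 <- c("cyl", "gear")  # one or more categorical columns, as a character vector
var2 <- "hp"              # a single numeric column name

result <- mtcars %>%
  # group by every column named in var1
  group_by(across(all_of(var1))) %>%
  # look up the numeric column by its name and sum it within each group
  summarize(Sum = sum(.data[[var2]]), .groups = "drop") %>%
  # largest totals first
  arrange(desc(Sum))

print(result)
```

If several numeric columns can be selected at once, summarize(across(all_of(var2), sum)) is one way to sum each of them, at the cost of losing the single Sum column to sort on.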
Original answer (older versions of dplyr)
If you have a vector of variable names, you should pass them to the .dots= parameter of group_by_. For example:
mtcars %>%
  group_by_(.dots = c("mpg", "hp", "wt")) %>%
  summarize(x = mean(gear))