Creating a dynamic Group By
Problem description
library(dplyr)

df = data.frame(
  A = c(1, 4, 5, 13, 2),
  B = c("Group 1", "Group 3", "Group 2", "Group 1", "Group 2"),
  C = c("Group 3", "Group 2", "Group 1", "Group 2", "Group 3")
)

# Mean of A for each level of B
df %>%
  group_by(B) %>%
  summarise(val = mean(A))

# Mean of A for each level of C
df %>%
  group_by(C) %>%
  summarise(val = mean(A))
Instead of writing a new chunk of code for each unique set of group_by variables, I would like to create a loop that iterates through the df data frame and saves the results into a list or a data frame.
I would like to see how the average value of feature A is spread across features B and C, without having to write a new chunk of code for each categorical feature in the data set.
I tried this:
List_Of_Groups <- map_df(df, function(i) {
  df %>%
    group_by(!!!syms(names(df)[1:i])) %>%
    summarize(newValue = mean(A))
})
Recommended answer
Using purrr's map, you can apply the chunk of code you specified to all the character columns. Basically, you map the names of the character variables to the function that follows:
purrr::map(names(df %>% select(where(is.character))), function(i) {
  df %>%
    group_by(!!sym(i)) %>%
    summarize(newValue = mean(A))
})
Output
# [[1]]
# A tibble: 3 x 2
# B newValue
# <chr> <dbl>
# 1 Group 1 7
# 2 Group 2 3.5
# 3 Group 3 4
#
# [[2]]
# A tibble: 3 x 2
# C newValue
# <chr> <dbl>
# 1 Group 1 5
# 2 Group 2 8.5
# 3 Group 3 1.5
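
Since the question also asks for the results in a single data frame, the same idea can be sketched so that each per-column summary is stacked into one table. This is only a minimal sketch, assuming dplyr 1.0.0 or later and R 4.0 or later (so that the string columns of data.frame stay character); the names group_cols, combined, variable and level are illustrative and do not come from the original answer.

```r
library(dplyr)
library(purrr)

df <- data.frame(
  A = c(1, 4, 5, 13, 2),
  B = c("Group 1", "Group 3", "Group 2", "Group 1", "Group 2"),
  C = c("Group 3", "Group 2", "Group 1", "Group 2", "Group 3")
)

# Names of the character columns, each used as a grouping variable in turn
group_cols <- df %>% select(where(is.character)) %>% names()

# Build one summary per grouping column. `.data[[col]]` looks the column up
# by its name (a string); renaming it to `level` gives every summary the
# same columns, so they can be stacked.
summaries <- map(group_cols, function(col) {
  df %>%
    group_by(level = .data[[col]]) %>%
    summarise(newValue = mean(A), .groups = "drop") %>%
    mutate(variable = col, .before = 1)
})

# Stack the per-column summaries into a single data frame
combined <- bind_rows(summaries)
print(combined)
```

The variable column records which grouping column each row came from, so the result can be filtered or plotted by grouping variable. If a list is preferred, summaries can be used directly, for example after naming its elements with purrr::set_names(summaries, group_cols).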