Summarising multiple variables without iteration

Problem description

Consider this data, for which I need the summary measures mean and sd on multiple variables:

# Load the packages used throughout; ####
library(dplyr)
library(tidyr)
library(purrr)

# Create grouping var; ####
mtcars <- mtcars %>% mutate(
        am = case_when(
                am == 0 ~ "Automatic",
                TRUE ~ "Manual"
        )
)
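
As a quick sanity check on the recoding above, the group sizes can be counted. This is only a sketch; the names `cars` and `counts` are introduced here for illustration and do not appear in the question.

```r
library(dplyr)

# Work on a copy so the built-in mtcars stays untouched
cars <- mtcars %>%
  mutate(am = case_when(am == 0 ~ "Automatic", TRUE ~ "Manual"))

# One row per level of the new grouping variable, with its row count in `n`
counts <- cars %>% count(am)
counts
```

Both levels should appear, which confirms that every row ends up in one of the two groups.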

With the following custom function and purrr, I can create a baseline table:

# Summarising function; ####
sum_foo <- function(data, var) {
        
        data %>% 
                group_by(am) %>% 
                summarise(
                        mean = mean( !!sym(var) , na.rm = TRUE),
                        sd   = sd( !!sym(var) , na.rm = TRUE)
                ) %>% 
                mutate(across(where(is.double), ~ round(.x, 2))) %>%  
                group_by(am) %>% 
                transmute(
                        value = paste(mean, "(±", sd, ")", sep = ""),
                        variable = var
                ) %>%
                pivot_wider(
                        names_from = "am"
                )
        
        
}


# Execute Function; ####
sum_variables <- c("mpg", "hp", "disp")


sum_variables %>% map(
        sum_foo,
        data = mtcars
) %>% reduce(
        bind_rows
)

This gives the following output:

# A tibble: 3 x 3
  variable Automatic       Manual        
  <chr>    <chr>           <chr>         
1 mpg      17.15(±3.83)    24.39(±6.17)  
2 hp       160.26(±53.91)  126.85(±84.06)
3 disp     290.38(±110.17) 143.53(±87.2) 

I want to get this output without using map and reduce, i.e. without iterating over the variables with rowwise or map.

I'm looking for an alternative tidyverse solution!

Recommended answer

Maybe you could use this solution:

library(dplyr)
library(tidyr)
library(tibble)

sum_variables %>%
  enframe() %>%
  rowwise() %>%
  mutate(output = list(sum_foo(mtcars, value))) %>%
  select(output) %>%
  unnest(cols = output)

# A tibble: 3 x 3
  variable Automatic       Manual        
  <chr>    <chr>           <chr>         
1 mpg      17.15(±3.83)    24.39(±6.17)  
2 hp       160.26(±53.91)  126.85(±84.06)
3 disp     290.38(±110.17) 143.53(±87.2) 

Updated: Or you could even modify your function in the following way:

sum_foo2 <- function(data, var) {
  data %>% 
    group_by(am) %>% 
    summarise(across(all_of(var), list(Mean = mean, sd = sd))) %>% 
    mutate(across(where(is.double), ~ round(.x, 2))) %>%  
    group_by(am) %>%
    summarise(across(ends_with("Mean"), ~ paste(.x, "(±", get(gsub("_Mean", "_sd", cur_column())), ")", sep = ""))) %>%
    pivot_longer(!am, names_to = "Mean", values_to = "Val") %>%
    pivot_wider(names_from = "am", values_from = "Val")
}

sum_foo2(mtcars, sum_variables)

# A tibble: 3 x 3
  Mean      Automatic       Manual        
  <chr>     <chr>           <chr>         
1 mpg_Mean  17.15(±3.83)    24.39(±6.17)  
2 hp_Mean   160.26(±53.91)  126.85(±84.06)
3 disp_Mean 290.38(±110.17) 143.53(±87.2) 
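
The first column here is called `Mean` and its labels carry a `_Mean` suffix, unlike the `variable` column in the question's table. A minimal sketch of one way to tidy that up afterwards follows. The tibble `res` is hand-copied from the first two rows of the output above, and the names `res` and `cleaned` are assumptions made for illustration.

```r
library(dplyr)
library(tibble)

# Hand-copied from the output shown above (first two rows only)
res <- tribble(
  ~Mean,      ~Automatic,       ~Manual,
  "mpg_Mean", "17.15(±3.83)",   "24.39(±6.17)",
  "hp_Mean",  "160.26(±53.91)", "126.85(±84.06)"
)

cleaned <- res %>%
  rename(variable = Mean) %>%                      # match the question's column name
  mutate(variable = sub("_Mean$", "", variable))   # drop the suffix added by across()

cleaned
```

The same two steps could be appended to the end of the pipeline inside `sum_foo2` instead of being applied to its result.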
 

If I were to trim the function above into a more concise version:

sum_foo2 <- function(data, var) {
  data %>%
    group_by(am) %>%
    summarise(across(all_of(var), ~ paste0(round(mean(.x), 2), "(±", round(sd(.x), 2), ")"))) %>%
    pivot_longer(!am, names_to = "Mean", values_to = "Val") %>%
    pivot_wider(names_from = "am", values_from = "Val")
}

sum_foo2(mtcars, sum_variables)
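
Another way to avoid iterating, which is not part of the original answer, is to reshape the data to long format first and summarise by variable and group in one pass. The sketch below assumes the same recoding of `am` as in the question. `sum_long`, `cars`, `sum_vars` and `long_result` are hypothetical names, and `na.rm = TRUE` is carried over from the question's `sum_foo`.

```r
library(dplyr)
library(tidyr)

# Recode the grouping variable on a copy, as in the question
cars <- mtcars %>%
  mutate(am = if_else(am == 0, "Automatic", "Manual"))

sum_vars <- c("mpg", "hp", "disp")

# sum_long() is a hypothetical helper, not the answer's function
sum_long <- function(data, vars) {
  data %>%
    select(am, all_of(vars)) %>%
    # One row per (observation, variable) pair
    pivot_longer(all_of(vars), names_to = "variable", values_to = "x") %>%
    group_by(variable, am) %>%
    summarise(
      value = paste0(round(mean(x, na.rm = TRUE), 2),
                     "(±", round(sd(x, na.rm = TRUE), 2), ")"),
      .groups = "drop"
    ) %>%
    # Spread the groups back out into columns
    pivot_wider(names_from = am, values_from = value) %>%
    # Restore the order in which the variables were requested
    arrange(match(variable, vars))
}

long_result <- sum_long(cars, sum_vars)
long_result
```

This version names the first column `variable` directly, so no renaming is needed afterwards. It relies on all selected columns sharing a type, which holds for the numeric columns used here.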
