Summarize data at different aggregate levels - R and tidyverse


Problem description

I'm creating a bunch of basic status reports, and one of the things I'm finding tedious is adding a total row to all my tables. I'm currently using the tidyverse approach, and this is an example of my current code. What I'm looking for is an option to have a few different summary levels included by default.

library(dplyr)

#copy the built-in iris data into the workspace so it shows up in RStudio (not required)
iris = iris

#summary at the group level
summary_grouped = iris %>% 
       group_by(Species) %>%
       summarize(mean_s_length = mean(Sepal.Length),
                 max_s_width = max(Sepal.Width))

#summary at the overall level
summary_overall = iris %>% 
  summarize(mean_s_length = mean(Sepal.Length),
            max_s_width = max(Sepal.Width)) %>%
  mutate(Species = "Overall")

#append results for report       
summary_table = rbind(summary_grouped, summary_overall)

Doing this over and over is very tedious. I kind of want something like this:

#wished-for syntax only: group_by() has no option that adds a total row
summary_overall = iris %>% 
       group_by(Species, total = TRUE) %>%
       summarize(mean_s_length = mean(Sepal.Length),
                 max_s_width = max(Sepal.Width))

FYI: if you're familiar with SAS, I'm looking for the same kind of functionality that the CLASS, WAYS, and TYPES statements in PROC MEANS provide, which let me control the level of summarization and get multiple levels in one call.

Any help is appreciated. I know I can write my own function, but I was hoping something already exists. I would also prefer to stick with the tidyverse style of programming, though I'm not set on that.

Recommended answer

Another option:

library(tidyverse)  

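iris %>% 
  # convert the factor to character so the NA can be replaced with "Overall" later
  mutate_at("Species", as.character) %>%
  # build a list of two data frames: one grouped by Species, one ungrouped
  list(group_by(.,Species), .) %>%
  # apply the same summary to both
  map(~summarize(.,mean_s_length = mean(Sepal.Length),
                 max_s_width = max(Sepal.Width))) %>%
  # stack the results; the overall row has no Species column, so it gets NA
  bind_rows() %>%
  # label the overall row
  replace_na(list(Species="Overall"))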
#> # A tibble: 4 x 3
#>   Species    mean_s_length max_s_width
#>   <chr>              <dbl>       <dbl>
#> 1 setosa              5.01         4.4
#> 2 versicolor          5.94         3.4
#> 3 virginica           6.59         3.8
#> 4 Overall             5.84         4.4
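
If you need this in several reports, the same idea can be wrapped in a small helper. The following is only a sketch: `summarize_with_total()` and its `total_label` argument are names made up here and are not part of dplyr. It assumes dplyr 1.0 or later (for `.groups` and `{{ }}`) and a single grouping column.

```r
library(dplyr)

# Hypothetical helper (not a dplyr function): summarize by one grouping
# column and append an overall row computed from the ungrouped data.
summarize_with_total <- function(data, group_col, ..., total_label = "Overall") {
  # per-group summary; the grouping column is converted to character so it
  # can be stacked with the text label of the overall row
  grouped <- data %>%
    mutate({{ group_col }} := as.character({{ group_col }})) %>%
    group_by({{ group_col }}) %>%
    summarize(..., .groups = "drop")

  # the same summary expressions applied to the whole data set
  overall <- data %>%
    summarize(...) %>%
    mutate({{ group_col }} := total_label)

  # bind_rows() matches columns by name, so column order does not matter
  bind_rows(grouped, overall)
}

summary_table <- summarize_with_total(
  iris, Species,
  mean_s_length = mean(Sepal.Length),
  max_s_width = max(Sepal.Width)
)
```

Computing the overall row from the raw data, rather than from the per-group results, matters for statistics such as the mean, where averaging the group means would only be correct if all groups had the same size.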
