Using dplyr to create a summary proportion table with several categorical/factor variables
Problem description
I am trying to create one table that summarizes several categorical variables (using frequencies and proportions) by another variable. I would like to do this using the dplyr package.
These previous Stack Overflow discussions partially cover what I am looking for: Relative frequencies / proportions with dplyr and Calculate relative frequency for a certain group.
Using the mtcars dataset, this is what the output would look like if I just wanted to look at the proportion of gear by am category:
library(dplyr)

mtcars %>%
  group_by(am, gear) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
#   am gear  n      freq
# 1  0    3 15 0.7894737
# 2  0    4  4 0.2105263
# 3  1    4  8 0.6153846
# 4  1    5  5 0.3846154
However, I actually want to look at not only gear by am, but also carb by am and cyl by am, separately, in the same table. If I amend the code to:
mtcars %>%
  group_by(am, gear, carb, cyl) %>%
  summarise(n = n()) %>%
  mutate(freq = n / sum(n))
I get the frequencies for each combination of am, gear, carb, and cyl, which is not what I want. Is there any way to do this with dplyr?
EDIT
Also, it would be an added bonus if anyone knew of a way to produce the table I want, but with the categories of am as the columns (as in a classic 2x2 table format). Here is an example of what I'm referring to. It is from one of my previous publications. I want to produce this table in R, so that I can output it directly to a Word document using RMarkdown.
With a tidyr/dplyr combination, here is how you would do it:
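library(tidyr)
library(dplyr)

mtcars %>%
  # stack gear, carb and cyl into variable/value pairs (long format)
  gather(variable, value, gear, carb, cyl) %>%
  group_by(am, variable, value) %>%
  # count rows per am/variable/value; summarise() drops the last grouping level
  summarise(n = n()) %>%
  # proportions are therefore computed within each am/variable group
  mutate(freq = n / sum(n))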
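For the layout asked about in the EDIT (categories of am as columns), one possible sketch is below. It is not part of the original answer. It assumes tidyr 1.0.0 or later, where pivot_longer() and pivot_wider() are available as the newer counterparts of gather() and spread(). The object names long and wide are made up for illustration.

```r
library(dplyr)
library(tidyr)

# Long format: one row per (am, variable, value) with its count and
# its share within the am/variable group. `long` is an illustrative name.
long <- mtcars %>%
  pivot_longer(cols = c(gear, carb, cyl),
               names_to = "variable", values_to = "value") %>%
  count(am, variable, value) %>%
  group_by(am, variable) %>%
  mutate(freq = n / sum(n)) %>%
  ungroup()

# Wide format: one n/freq column pair per level of am
# (n_0, n_1, freq_0, freq_1). Combinations that do not occur are filled with 0.
wide <- long %>%
  pivot_wider(names_from = am,
              values_from = c(n, freq),
              values_fill = 0)

wide
```

If the result is meant for an R Markdown document, passing wide to knitr::kable() should render it as a table, although the exact formatting of the published example would likely need further work.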