Summary of proportions by group
Question
What would be the best tool/package to use to calculate proportions by subgroup? I thought I could try something like this:
data(mtcars)
library(plyr)
ddply(mtcars, .(cyl), transform, Pct = gear/length(gear))
But the output is not what I want, as I would want something with a number of rows equal to cyl. Even if I change it to summarise, I still get the same problem.
I am open to other packages, but I thought plyr would be best, as I would eventually like to build a function around this. Any ideas?
I'd appreciate any help with solving a basic problem like this.
Recommended answer
To get the frequencies within each group:
library(dplyr)
mtcars %>% count(cyl, gear) %>% mutate(Freq = n/sum(n))
# Source: local data frame [8 x 4]
# Groups: cyl [3]
#
# cyl gear n Freq
# (dbl) (dbl) (int) (dbl)
# 1 4 3 1 0.09090909
# 2 4 4 8 0.72727273
# 3 4 5 2 0.18181818
# 4 6 3 2 0.28571429
# 5 6 4 4 0.57142857
# 6 6 5 1 0.14285714
# 7 8 3 12 0.85714286
# 8 8 5 2 0.14285714
Or, equivalently:
mtcars %>% group_by(cyl, gear) %>% summarise(n = n()) %>% mutate(Freq = n/sum(n))
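Be careful about what the grouping is at each stage, or your numbers will be off. In the output above, the counted result is still grouped by cyl (Groups: cyl [3]), so sum(n) is the total within each cyl value and Freq sums to 1 per cyl. If the result were ungrouped at that point, sum(n) would be the total over all rows and Freq would be a share of the whole data set instead.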
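Whether count() leaves its result grouped by cyl depends on the dplyr version: the output shown above is grouped, while recent releases document that count() returns output with the same groups as its input, which for plain mtcars means ungrouped. A minimal sketch that does not rely on either behaviour is to state the grouping explicitly before dividing. The name freq and the sanity check at the end are illustrative additions, not part of the original answer.

library(dplyr)

freq <- mtcars %>%
  count(cyl, gear) %>%           # one row per (cyl, gear) pair, with its count n
  group_by(cyl) %>%              # make the grouping explicit before dividing
  mutate(Freq = n / sum(n)) %>%  # share of each gear within its cyl group
  ungroup()                      # drop the grouping so later steps are not surprised

# Sanity check: the proportions within each cyl value should sum to 1
freq %>%
  group_by(cyl) %>%
  summarise(total = sum(Freq))

If the totals in the check are not all 1, the division was done over the wrong grouping.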