按R中的子组百分比汇总 [英] Summarizing by subgroup percentage in R
问题描述
我有一个像这样的数据集:
I have a dataset like this:
df = data.frame(group = c(rep('A',4), rep('B',3)),
subgroup = c('a', 'b', 'c', 'd', 'a', 'b', 'c'),
value = c(1,4,2,1,1,2,3))
group | subgroup | value
------------------------
A | a | 1
A | b | 4
A | c | 2
A | d | 1
B | a | 1
B | b | 2
B | c | 3
我想要的是获取每个组中每个子组的值的百分比,即输出应为:
What I want is to get the percentage of the values of each subgroup within each group, i.e. the output should be:
group | subgroup | percent
------------------------
A | a | 0.125
A | b | 0.500
A | c | 0.250
A | d | 0.125
B | a | 0.167
B | b | 0.333
B | c | 0.500
A组,A组的示例:值是1,整个A组的总和是8(a = 1,b = 4,c = 2,d = 1)-因此1/8 = 0.125
Example for group A, subgroup A: the value was 1, the sum of the whole group A is 8 (a=1, b=4, c=2, d=1) - hence 1/8 = 0.125
So far I've only found fairly simple aggregates like this, but I cannot figure out how to do the "divide by a sum within a subgroup" part.
推荐答案
根据您的评论,如果子组是唯一的,则可以
Per your comment, if the subgroups are unique you can do
library(dplyr)
group_by(df, group) %>% mutate(percent = value/sum(value))
# group subgroup value percent
# 1 A a 1 0.1250000
# 2 A b 4 0.5000000
# 3 A c 2 0.2500000
# 4 A d 1 0.1250000
# 5 B a 1 0.1666667
# 6 B b 2 0.3333333
# 7 B c 3 0.5000000
或者要删除value
列并同时添加percent
列,请使用transmute
Or to remove the value
column and add the percent
column at the same time, use transmute
group_by(df, group) %>% transmute(subgroup, percent = value/sum(value))
# group subgroup percent
# 1 A a 0.1250000
# 2 A b 0.5000000
# 3 A c 0.2500000
# 4 A d 0.1250000
# 5 B a 0.1666667
# 6 B b 0.3333333
# 7 B c 0.5000000
这篇关于按R中的子组百分比汇总的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!