R: percentage by bin in a ggplot histogram
Problem description
I have a data set like this ->
library(ggplot2)
response <- c("Yes","No")
gend <- c("Female","Male")
purchase <- sample(response, 20, replace = TRUE)
gender <- sample(gend, 20, replace = TRUE)
df <- as.data.frame(purchase)
df <- cbind(df,gender)
so head(df) looks like this ->
  purchase gender
1      Yes Female
2       No   Male
3       No Female
4       No Female
5      Yes Female
6       No Female
Also, so you can validate my examples, here is table(df) for my particular sampling (please don't worry about matching my percentages).
        gender
purchase Female Male
     No       6    3
     Yes      4    7
I want a "histogram" showing Gender, but split by Purchase. I have gone this far ->
ggplot(df) +
geom_bar(aes(y = (..count..)/sum(..count..)),position = "dodge") +
aes(gender, fill = purchase)
which produces ->
histogram with split bins, by percentage, but not the aggregate level I want
The Y axis shows percentages as I want, but each bar is a percentage of the whole chart.
What I want is for each bar to be a percentage of its respective "Purchase" group. So in the chart above I would like the four bars to be 66%, 36%, 33%, 64%, in that order.
I have tried geom_histogram to no avail. I have searched SO, the ggplot documentation, and several books.
Regarding the suggestion to look at the previous question about facets: that does work, but I had hoped to keep the chart visually as it is above, rather than split it into "two charts". So...
Does anyone know how to do this?
Thanks.
Recommended answer
Regarding the percentages you want, is the denominator based on gender or on purchase? In the example given above, 66% for female and no purchase is the result of dividing 6 by the total number of no purchases (6+3), rather than by the total number of females (6+4).
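To make the two candidate denominators concrete, here is a minimal sketch in base R. It assumes the counts from the table(df) output in the question; the variable names counts, by_purchase and by_gender are only illustrative.

```r
# Counts taken from the table(df) shown in the question
# (matrix() fills column by column: Female = 6, 4; Male = 3, 7)
counts <- matrix(c(6, 4, 3, 7), nrow = 2,
                 dimnames = list(purchase = c("No", "Yes"),
                                 gender = c("Female", "Male")))

# Denominator = purchase total: each row sums to 1
by_purchase <- prop.table(counts, margin = 1)

# Denominator = gender total: each column sums to 1
by_gender <- prop.table(counts, margin = 2)

print(round(by_purchase, 3))
print(round(by_gender, 3))
```

With purchase as the denominator, the Female/No cell is 6/9, which matches the 66% asked for in the question; with gender as the denominator the same cell would be 6/10.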
It's definitely possible to plot that, but I'm not sure if the result would be intuitive to interpret. I got confused myself for a while.
The following hack makes use of the weight aesthetic. I've used purchase as the grouping variable here based on the expected output described in the question, though I think gender makes more sense (as per TTNK's answer above):
library(dplyr)   # provides %>%, group_by(), mutate(), n()
library(ggplot2)

df <- data.frame(purchase = c(rep("No", 6), rep("Yes", 4), rep("No", 3), rep("Yes", 7)),
                 gender = c(rep("Female", 10), rep("Male", 10)))

ggplot(df %>%
         group_by(purchase) %>% # change this to gender if that's the intended denominator
         mutate(w = 1 / n()) %>% # each row weighs 1/(group size), so a group's bars sum to 100%
         ungroup()) +
  aes(gender, fill = purchase, weight = w) +
  geom_bar(position = "dodge") +
  scale_y_continuous(name = "percent", labels = scales::percent)
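As an alternative to the weight trick, the shares can be computed explicitly before plotting, which makes the denominator visible in the data. This is only a sketch, not part of the original answer: the names pct and share are illustrative, and it uses dplyr's count() together with geom_col(), which plots precomputed values as bar heights.

```r
library(dplyr)
library(ggplot2)

df <- data.frame(
  purchase = c(rep("No", 6), rep("Yes", 4), rep("No", 3), rep("Yes", 7)),
  gender   = c(rep("Female", 10), rep("Male", 10))
)

# One row per purchase/gender combination, with its share of the purchase group
pct <- df %>%
  count(purchase, gender) %>%        # adds a column n with the cell counts
  group_by(purchase) %>%             # change to gender for the other denominator
  mutate(share = n / sum(n)) %>%
  ungroup()

# geom_col() uses the precomputed share directly as the bar height
p <- ggplot(pct, aes(gender, share, fill = purchase)) +
  geom_col(position = "dodge") +
  scale_y_continuous(name = "percent", labels = scales::percent)
print(p)
```

Printing pct before plotting is a quick way to check that the shares within each purchase group add up to 1.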