R: percentage by bin in a ggplot histogram


Question

I have a data set like this:

library(ggplot2)

response <- c("Yes","No")
gend <- c("Female","Male")

purchase <- sample(response, 20, replace = TRUE)
gender <- sample(gend, 20, replace = TRUE)

# Combine the two sampled vectors into one data frame
df <- data.frame(purchase, gender)

so head(df) looks like this:

  purchase gender
1      Yes Female
2       No   Male
3       No Female
4       No Female
5      Yes Female
6       No Female

Also, so you can validate my examples, here is table(df) for my particular sampling.
(Please don't worry about matching my percentages.)

         gender
purchase Female Male
     No       6    3
     Yes      4    7

I want a "histogram" showing gender, but split by purchase. This is as far as I have gotten:

ggplot(df, aes(gender, fill = purchase)) +
  # after_stat(count) replaces the older ..count.. notation, deprecated in ggplot2 3.4.0
  geom_bar(aes(y = after_stat(count / sum(count))), position = "dodge")

which produces:

[Figure: histogram with split bins, by percentage, but not the aggregate level I want]

The Y axis shows percentages as I want, but each bar is a percentage of the whole chart. What I want is for the two "Female" bars to each be a percentage of their respective "Purchase" group. So in the chart above I would like the four bars to be 66%, 36%, 33%, 64%, in that order.

I have tried geom_histogram to no avail. I have checked SO, searched the web, and read the ggplot documentation and several books.

Regarding the suggestion to look at the previous question about facets: that does work, but I had hoped to keep the chart visually as it is above, as opposed to splitting it into "two charts". So...

Does anyone know how to do this?

Thanks.

Recommended answer

Regarding the percentages you want, is the denominator based on gender or on purchase? In the example given above, 66% for female & no purchase would be the result of 6 divided by the sum of no purchases (6+3) rather than the sum of all females (6+4).
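To make the two candidate denominators concrete, here is a minimal sketch in base R. It assumes the counts from the table(df) output in the question; the names counts, by_purchase and by_gender are only illustrative.

```r
# Counts copied from the table(df) output in the question (an assumption:
# a fresh sample() call would give different numbers)
counts <- matrix(c(6, 4, 3, 7), nrow = 2,
                 dimnames = list(purchase = c("No", "Yes"),
                                 gender = c("Female", "Male")))

# Denominator = purchase totals: each row sums to 1
by_purchase <- prop.table(counts, margin = 1)

# Denominator = gender totals: each column sums to 1
by_gender <- prop.table(counts, margin = 2)

print(round(by_purchase, 2))
print(round(by_gender, 2))
```

With margin = 1 the Female/No cell is 6/9, matching the 66% in the question; with margin = 2 it is 6/10.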

It's definitely possible to plot that, but I'm not sure the result would be intuitive to interpret. I got confused myself for a while.

The following hack makes use of the weight aesthetic. I've used purchase as the grouping variable here, based on the expected output described in the question, though I think gender makes more sense (as per TTNK's answer above):

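library(ggplot2)
library(dplyr)  # provides %>%, group_by(), mutate(), n()

# Rebuild the question's data deterministically from its table(df) counts
df <- data.frame(purchase = c(rep("No", 6), rep("Yes", 4), rep("No", 3), rep("Yes", 7)),
                 gender = c(rep("Female", 10), rep("Male", 10)))

# Weight each row by 1 / (size of its group) so the weights in a group sum to 1
df_weighted <- df %>%
  group_by(purchase) %>% # change this to gender if that's the intended denominator
  mutate(w = 1 / n()) %>%
  ungroup()

ggplot(df_weighted, aes(gender, fill = purchase, weight = w)) +
  geom_bar(position = "dodge") +
  scale_y_continuous(name = "percent", labels = scales::percent)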
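If the weight trick feels opaque, one alternative is to compute the shares first and draw them with geom_col. This is only a sketch of a different technique, not the approach used in the answer above; the names pct and share are illustrative, and it assumes the same hard-coded data.

```r
library(ggplot2)
library(dplyr)

# Same deterministic data as above (counts taken from the question's table)
df <- data.frame(purchase = c(rep("No", 6), rep("Yes", 4), rep("No", 3), rep("Yes", 7)),
                 gender = c(rep("Female", 10), rep("Male", 10)))

pct <- df %>%
  count(purchase, gender) %>%      # one row per purchase/gender pair, count in column n
  group_by(purchase) %>%           # swap to gender to change the denominator
  mutate(share = n / sum(n)) %>%   # share of each bar within its purchase group
  ungroup()

# geom_col plots the precomputed share directly, so no stat or weight is needed
p <- ggplot(pct, aes(gender, share, fill = purchase)) +
  geom_col(position = "dodge") +
  scale_y_continuous(name = "percent", labels = scales::percent)

print(p)
```

Precomputing keeps the denominator visible in the data frame, which makes it easy to inspect pct before plotting.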
