R stacked percentage bar plot with percentage of binary factor and labels (with ggplot)


Problem description


I want to produce a graphic that looks something like this:

My original data set looks something like this:

> bb[sample(nrow(bb), 20), ]
      IMG QUANT FIX
25663   1     1   0
7936    2     2   0
23586   3     2   0
23017   2     2   1
31363   1     3   1
7886    2     2   0
23819   3     3   1
29838   2     2   1
8169    2     3   1
9870    2     3   0
31440   2     1   0
35564   3     1   0
24066   1     2   0
12020   3     2   0
6742    3     2   0
6189    2     3   0
26692   2     3   0
1387    3     2   0
31839   2     3   1
28637   3     2   0

So the idea is that the bars display the percentage of cases where FIX == 1, per level of the factor QUANT and per level of the factor IMG.

I've aggregated my data set into percentages using plyr:

library(plyr)
bb.perc <- ddply(bb,.(QUANT,IMG),summarise,FIX.PROP = sum(FIX) / length(FIX))

It does almost the right thing:

  QUANT IMG   FIX.PROP
1     1   1 0.52439024
2     1   2 0.19085366
3     1   3 0.13658537
4     2   1 0.20414201
5     2   2 0.53964497
6     2   3 0.09585799
7     3   1 0.29000000
8     3   2 0.13000000
9     3   3 0.40705882

But now if I make a graph, it doesn't account for the FIX==0 cases, i.e. all bars have the same height, namely 100%, which isn't what I want. Note how the individual QUANT subframes don't add up to 100%:

> sum(bb.perc[1:3,]$FIX.PROP)
[1] 0.8518293
> sum(bb.perc[4:6,]$FIX.PROP)
[1] 0.839645
> sum(bb.perc[7:9,]$FIX.PROP)
[1] 0.8270588
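The sums above need not reach 100% because each FIX.PROP value has its own denominator (the row count of its QUANT/IMG cell). Here is a minimal sketch of the same aggregation in base R, assuming a synthetic stand-in for bb (the real data set is not shown in full, so the column values and the 0.3 probability below are made up for illustration):

```r
# Synthetic stand-in for `bb`; the question's real data set is not available.
set.seed(1)
n <- 3000
bb <- data.frame(
  IMG   = sample(1:3, n, replace = TRUE),
  QUANT = sample(1:3, n, replace = TRUE),
  FIX   = rbinom(n, size = 1, prob = 0.3)  # assumed rate, for illustration only
)

# Proportion of FIX == 1 within each (QUANT, IMG) cell.
# mean() of a 0/1 vector equals sum(FIX) / length(FIX).
bb.perc <- aggregate(FIX ~ QUANT + IMG, data = bb, FUN = mean)
names(bb.perc)[names(bb.perc) == "FIX"] <- "FIX.PROP"
bb.perc <- bb.perc[order(bb.perc$QUANT, bb.perc$IMG), ]

# Sum of the three cell proportions for each QUANT level.
# Each cell has its own denominator, so nothing forces these sums to equal 1.
quant.sums <- tapply(bb.perc$FIX.PROP, bb.perc$QUANT, sum)

print(bb.perc)
print(quant.sums)
```

If the bars are meant to be shares of one common total instead, the counts would have to be divided by that shared denominator rather than by the per-cell row count.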

The best I could do with R is to display counts:

library(ggplot2)
library(scales)  # provides percent()
# Take only the positive samples
bb.pos <- bb[bb$FIX == 1,]
# Plot the counts of the positive samples
ggplot(bb.pos,aes(factor(QUANT),fill=factor(IMG))) + geom_bar() +
  scale_y_continuous(labels=percent)

This results in a plot that is also not what I want:

  • The percentage scale is way off. I need a way to pass the 100% point to the percent function, but I have no idea how.
  • It lacks the labels.
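On the first point: the percent function from the scales package only formats numbers. It multiplies its input by 100 and appends a percent sign, so there is no "100% point" to pass in; the values have to be proportions before they reach it. A small sketch with made-up counts (the three numbers are hypothetical, not taken from the question's data):

```r
library(scales)

# Hypothetical counts of FIX == 1 rows for three bars.
counts <- c(120, 45, 30)

# percent() treats its input as a proportion, so raw counts are
# scaled by 100 as well and end up in the thousands of percent.
labels.raw <- percent(counts)

# Dividing by a total first gives proportions in [0, 1],
# which percent() then turns into sensible labels.
labels.prop <- percent(counts / sum(counts))

print(labels.raw)
print(labels.prop)
```

The same applies to scale_y_continuous(labels = percent): the y values mapped in the plot need to be proportions, not counts.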

There are a great many similar questions on SO already, but I seem to lack the intelligence (or understanding of R) to extrapolate from them to a solution for my particular problem.

Thanks for any pointers!

EDIT: Sven Hohenstein provided an answer already, but here's how I ended up doing it myself as well:

> ggplot(bb.perc,aes(x=factor(QUANT),y=FIX.PROP,label=paste(round(FIX.PROP*100),
     "%"),fill=factor(IMG)))+ geom_bar(stat="identity") + geom_text(position="stack",
     aes(ymax=1),vjust=5) + scale_y_continuous(labels = percent)

This uses the bb.perc that I defined further up with plyr. It has the advantage that the percentages are computed locally per column, and not globally.
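For readers on a recent ggplot2, here is a hedged sketch of the same idea. It assumes ggplot2 >= 3.3.0 and scales >= 1.0.0, rebuilds bb.perc from the table printed earlier, and swaps in geom_col() and position_stack(vjust = 0.5) for the stat = "identity" and vjust = 5 combination; those substitutions are mine, not part of the original post:

```r
library(ggplot2)
library(scales)

# bb.perc rebuilt from the table printed in the question.
bb.perc <- data.frame(
  QUANT    = rep(1:3, each = 3),
  IMG      = rep(1:3, times = 3),
  FIX.PROP = c(0.52439024, 0.19085366, 0.13658537,
               0.20414201, 0.53964497, 0.09585799,
               0.29000000, 0.13000000, 0.40705882)
)

p <- ggplot(bb.perc,
            aes(x = factor(QUANT), y = FIX.PROP, fill = factor(IMG),
                label = percent(FIX.PROP, accuracy = 1))) +
  geom_col() +                                       # bars from precomputed values
  geom_text(position = position_stack(vjust = 0.5)) +  # label centred in each segment
  scale_y_continuous(labels = percent, limits = c(0, 1)) +
  labs(x = "QUANT", y = "FIX.PROP", fill = "IMG")

# Call print(p) to draw the plot.
```

Setting limits = c(0, 1) pins the top of the axis at 100%, so the bars visibly stop short of it.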

Thanks everyone for the help. The following two questions and their respective answers helped me greatly in getting it right:

Stacked Bar Graph Labels with ggplot2

Adding labels to ggplot bar chart

What I did wrong initially, was pass the position = "fill" parameter to geom_bar(), which for some reason made all the bars have the same height!
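The equal heights are what position = "fill" is designed to do: it stacks the segments and then rescales every stack to a total height of 1. A minimal sketch comparing it with position = "stack", assuming a synthetic stand-in for bb (the data and the 0.3 probability are made up):

```r
library(ggplot2)

# Synthetic stand-in for `bb`; the question's real data set is not available.
set.seed(1)
n <- 3000
bb <- data.frame(
  IMG   = sample(1:3, n, replace = TRUE),
  QUANT = sample(1:3, n, replace = TRUE),
  FIX   = rbinom(n, size = 1, prob = 0.3)  # assumed rate, for illustration only
)
bb.pos <- bb[bb$FIX == 1, ]

p.stack <- ggplot(bb.pos, aes(factor(QUANT), fill = factor(IMG))) +
  geom_bar(position = "stack")
p.fill <- ggplot(bb.pos, aes(factor(QUANT), fill = factor(IMG))) +
  geom_bar(position = "fill")

# Top of each bar, read from the data ggplot2 computed for the layer.
top <- function(p) {
  d <- layer_data(p)
  tapply(d$ymax, as.numeric(d$x), max)
}

print(top(p.stack))  # bar heights are raw counts and differ per QUANT
print(top(p.fill))   # every bar is rescaled to the same height
```

So "fill" suits plots where each bar should show the composition of its own total, and "stack" suits plots where the bar heights themselves carry information.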

Solution

This is a way to generate the plot:

ggplot(bb[bb$FIX == 1, ],aes(x = factor(QUANT), fill = factor(IMG), 
                             y = (..count..)/sum(..count..))) +
 geom_bar() +
 stat_bin(geom = "text",
          aes(label = paste(round((..count..)/sum(..count..)*100), "%")),
          vjust = 5) +
 scale_y_continuous(labels = percent)

Change the value of the vjust parameter to adjust the vertical position of the labels.
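The code above uses the ..count.. notation and stat_bin() on a discrete axis, which newer ggplot2 releases handle differently. The following is a sketch of the same plot for recent versions, assuming ggplot2 >= 3.3.0 (for after_stat()) and scales >= 1.0.0, and using a synthetic stand-in for bb. The use of geom_text(stat = "count") and position_stack(vjust = 0.5) is my substitution, not the answer's original code:

```r
library(ggplot2)
library(scales)

# Synthetic stand-in for `bb`; the question's real data set is not available.
set.seed(1)
n <- 3000
bb <- data.frame(
  IMG   = sample(1:3, n, replace = TRUE),
  QUANT = sample(1:3, n, replace = TRUE),
  FIX   = rbinom(n, size = 1, prob = 0.3)  # assumed rate, for illustration only
)

p <- ggplot(bb[bb$FIX == 1, ],
            aes(x = factor(QUANT), fill = factor(IMG))) +
  # Bar segments: count of each (QUANT, IMG) group over the count of all
  # FIX == 1 rows, i.e. a global percentage as in the answer above.
  geom_bar(aes(y = after_stat(count / sum(count)))) +
  # Labels: same statistic, formatted as whole percentages and
  # centred vertically inside each stacked segment.
  geom_text(
    aes(y = after_stat(count / sum(count)),
        label = after_stat(percent(count / sum(count), accuracy = 1))),
    stat = "count",
    position = position_stack(vjust = 0.5)
  ) +
  scale_y_continuous(labels = percent) +
  labs(x = "QUANT", y = "Share of FIX == 1 rows", fill = "IMG")

# Call print(p) to draw the plot.
```

Because the denominator is the total number of FIX == 1 rows, all segments across all bars add up to 100% here, unlike the per-cell FIX.PROP values from the question.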
