How to plot a (sophisticated) stacked barplot in ggplot2, without complicated manual data aggregation


Problem description


I want to plot a (faceted) stacked barplot where the y-axis is in percent, with the frequency labels displayed inside the bars.

After quite some work and after viewing many different questions on Stack Overflow, I found a way to do this with ggplot2. However, I don't do it directly with ggplot2: I aggregate my data manually with a table call. I do this manual aggregation in a complicated way and also calculate the percent values manually with temp variables (see the source code comment "manually aggregate data").

How can I do the same plot, but in a nicer way without the manual and complicated data aggregation?

library(ggplot2)
library(scales)

library(gridExtra)
library(plyr)

##
##  Random Data
##
fact1 <- factor(floor(runif(1000, 1,6)),
                      labels = c("A","B", "C", "D", "E"))

fact2 <- factor(floor(runif(1000, 1,6)),
                      labels = c("g1","g2", "g3", "g4", "g5"))

##
##  STACKED BAR PLOT that scales the y-axis to 100%
##

## manually aggregate data
##
mytable <- as.data.frame(table(fact1, fact2))

colnames(mytable) <- c("caseStudyID", "Group", "Freq")

mytable$total <- sapply(mytable$caseStudyID,
                        function(caseID) sum(subset(mytable, caseStudyID == caseID)$Freq))

mytable$percent <- round((mytable$Freq/mytable$total)*100,2)

mytable2 <- ddply(mytable, .(caseStudyID), transform, pos = cumsum(percent) - 0.5*percent)


## all case studies in one plot (SCALED TO 100%)

p1 <- ggplot(mytable2, aes(x=caseStudyID, y=percent, fill=Group)) +
    geom_bar(stat="identity") +
    theme(legend.key.size = unit(0.4, "cm")) +
    theme(axis.text.x = element_text(angle = 60, hjust = 1)) +
    geom_text(aes(label = sapply(Freq, function(x) ifelse(x>0, x, NA)), y = pos), size = 3) # the ifelse guards against printing labels with "0" within a bar


print(p1)


Solution

After you make the data:

fact1 <- factor(floor(runif(1000, 1,6)),
                  labels = c("A","B", "C", "D", "E"))

fact2 <- factor(floor(runif(1000, 1,6)),
                  labels = c("g1","g2", "g3", "g4", "g5"))

dat = data.frame(caseStudyID=fact1, Group=fact2)

You can automate making an unlabeled graph of the kind that you want with position_fill:

ggplot(dat, aes(caseStudyID, fill=Group)) + geom_bar(position="fill")
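
Since the question asks for an axis in percent, here is a minimal sketch of how the proportion axis produced by position="fill" could be relabelled. The fixed seed, the sample()-based data, the axis title, and the name p_pct are assumptions made for illustration; scales::percent comes from the scales package that the question already loads.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
dat <- data.frame(
  caseStudyID = factor(sample(c("A", "B", "C", "D", "E"), 1000, replace = TRUE)),
  Group       = factor(sample(c("g1", "g2", "g3", "g4", "g5"), 1000, replace = TRUE))
)

# position = "fill" rescales every bar to the range 0..1;
# scales::percent only changes how the axis breaks are printed
p_pct <- ggplot(dat, aes(caseStudyID, fill = Group)) +
  geom_bar(position = "fill") +
  scale_y_continuous(labels = scales::percent) +
  ylab("Share of cases")  # hypothetical axis title

print(p_pct)
```

The underlying values stay proportions between 0 and 1, so label positions computed from the built plot data are on that same 0..1 scale.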

I don't know if there's a way to generate the text labels automatically. The positions and counts from the stacked graph are accessible with ggplot_build, if you want to use what ggplot calculates instead of doing it separately.

p = ggplot(dat, aes(caseStudyID, fill=Group)) + geom_bar(position="fill")
ggplot_build(p)$data[[1]]

That will return a data frame with (among other things) count, x, y, ymin, and ymax variables that can be used to create positioned labels.
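
To see what those columns contain, the sketch below pulls them out and checks that each segment's height equals its count divided by the total count of its bar. It uses layer_data(p, 1), which is ggplot2's shortcut for ggplot_build(p)$data[[1]]. The seed, the sample()-based data, and the names built, bar_total, and height_ok are assumptions for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
dat <- data.frame(
  caseStudyID = factor(sample(c("A", "B", "C", "D", "E"), 1000, replace = TRUE)),
  Group       = factor(sample(c("g1", "g2", "g3", "g4", "g5"), 1000, replace = TRUE))
)

p <- ggplot(dat, aes(caseStudyID, fill = Group)) + geom_bar(position = "fill")

# Shortcut for ggplot_build(p)$data[[1]]: the computed data of the first layer
built <- layer_data(p, 1)
head(built[, c("x", "count", "ymin", "ymax")])

# Total count per bar, repeated for every segment of that bar
bar_total <- ave(built$count, as.numeric(built$x), FUN = sum)

# With position = "fill", segment height should be count / bar total
height_ok <- all.equal(built$ymax - built$ymin, built$count / bar_total)
height_ok
```

In other words, count carries the raw frequency for the label text, while ymin and ymax carry the already normalised position of the segment.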

If you want the labels vertically centered in each category, first make a column with values halfway between ymin and ymax.

freq = ggplot_build(p)$data[[1]]
freq$y_pos = (freq$ymin + freq$ymax) / 2

Then add the labels to the graph with annotate.

p + annotate(x=freq$x, y=freq$y_pos, label=freq$count, geom="text", size=3)
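
As a possible alternative to the ggplot_build route, a text layer can let ggplot2 count and position the labels itself. This is a sketch rather than part of the original answer: it assumes ggplot2 3.3.0 or later (for after_stat()), and the seed, the sample()-based data, and the name p_labeled are assumptions for illustration.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed so the sketch is reproducible
dat <- data.frame(
  caseStudyID = factor(sample(c("A", "B", "C", "D", "E"), 1000, replace = TRUE)),
  Group       = factor(sample(c("g1", "g2", "g3", "g4", "g5"), 1000, replace = TRUE))
)

p_labeled <- ggplot(dat, aes(caseStudyID, fill = Group)) +
  geom_bar(position = "fill") +
  geom_text(
    # stat = "count" tallies the rows per x and group; after_stat(count)
    # uses that tally as the label text
    aes(label = after_stat(count), group = Group),
    stat = "count",
    # the same fill stacking as the bars, with each label centred
    # vertically in its segment
    position = position_fill(vjust = 0.5),
    size = 3
  )

print(p_labeled)
```

Because the counting stat only returns combinations that occur in the data, empty cells get no "0" label, which is what the ifelse() in the question was guarding against. The text layer is an ordinary layer rather than an annotation, so it should also follow facets if facet_wrap() or facet_grid() is added.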
