Label selected percentage values inside stacked bar plot (ggplot2)


Problem description

I want to put labels of the percentages on my stacked bar plot. However, I only want to label the largest 3 percentages for each bar. I went through a lot of helpful posts on SO (for example: 1, 2, 3), and here is what I've accomplished so far:

library(ggplot2)
library(scales)  # provides percent(), used for the axis labels below
groups<-factor(rep(c("1","2","3","4","5","6","Missing"),4))
site<-c(rep("Site1",7),rep("Site2",7),rep("Site3",7),rep("Site4",7))
counts<-c(7554,6982, 6296,16152,6416,2301,0,
          20704,10385,22041,27596,4648, 1325,0,
          17200, 11950,11836,12303, 2817,911,1,
          2580,2620,2828,2839,507,152,2)
tapply(counts,site,sum)
tot<-c(rep(45701,7),rep(86699,7), rep(57018,7), rep(11528,7))
prop<-sprintf("%.1f%%", counts/tot*100)

data<-data.frame(groups,site,counts,prop)

ggplot(data, aes(x=site, y=counts,fill=groups)) + geom_bar()+
  stat_bin(geom = "text",aes(y=counts,label = prop),vjust = 1) +
  scale_y_continuous(labels = percent)

I wanted to insert my output image here but don't seem to have enough reputation... But the code above should be able to produce the plot.

So how can I label only the 3 largest percentages on each bar? Also, for the legend, is it possible to change the order of the categories? For example, put "Missing" first. This is not a big issue here, but for my real data set the order of the categories in the legend really bothers me.

I'm new on this site, so if there's anything that's not clear about my question, please let me know and I will fix it. I appreciate any answer/comments! Thank you!

Solution

I did this in a sort of hacky manner. It isn't that elegant.

Anyways, I used the plyr package, since the split-apply-combine strategy seemed to be the way to go here.

I recreated your data frame with a variable perc that represents the percentage for each site. Then, for each site, I just kept the 3 largest values for prop and replaced the rest with "".

# I added some variables, and added stringsAsFactors=FALSE
data <- data.frame(groups, site, counts, tot, perc=counts/tot,
                   prop, stringsAsFactors=FALSE)

# Load plyr, plus scales for the percent() label formatter used in the plot
library(plyr)
library(scales)
# Split on the site variable, and keep all the other variables (is there an
# option to keep all variables in the final result?)
data2 <- ddply(data, ~site, summarize, 
               groups=groups,
               counts=counts, 
               perc=perc,
               prop=ifelse(perc %in% sort(perc, decreasing=TRUE)[1:3], prop, ""))
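One caveat with the `%in%` test above: if two groups within a site tie at the third-largest percentage, more than three labels survive. Here is a minimal base R sketch of the same per-site selection, assuming ties should be broken by row order. That choice, and the names `rnk`, `label`, and `data3`, are illustrative and not part of the code above.

```r
# Rebuild the example data from the question so the sketch runs on its own
groups <- rep(c("1", "2", "3", "4", "5", "6", "Missing"), 4)
site   <- rep(c("Site1", "Site2", "Site3", "Site4"), each = 7)
counts <- c(7554, 6982, 6296, 16152, 6416, 2301, 0,
            20704, 10385, 22041, 27596, 4648, 1325, 0,
            17200, 11950, 11836, 12303, 2817, 911, 1,
            2580, 2620, 2828, 2839, 507, 152, 2)

# Per-site totals and shares, computed instead of typed in by hand
tot  <- ave(counts, site, FUN = sum)
perc <- counts / tot
prop <- sprintf("%.1f%%", perc * 100)

# Rank within each site, largest share first; ties.method = "first"
# (an assumption) guarantees exactly three labels per site
rnk   <- ave(-perc, site, FUN = function(x) rank(x, ties.method = "first"))
label <- ifelse(rnk <= 3, prop, "")

data3 <- data.frame(groups, site, counts, perc, label,
                    stringsAsFactors = FALSE)
```

`data3$label` can then stand in for `prop` in the plotting call.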

# I changed some of the plotting parameters
ggplot(data2, aes(x=site, y=perc, fill=groups)) + geom_bar()+
  stat_bin(geom = "text", aes(y=perc, label = prop),vjust = 1) +
  scale_y_continuous(labels = percent)
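
The `geom_bar()` plus `stat_bin(geom = "text")` combination above depends on how older ggplot2 releases behaved; as far as I know, current releases reject a `y` aesthetic in both. Below is a sketch of the same plot, assuming ggplot2 2.2.0 or later (for `geom_col()` and the `vjust` argument of `position_stack()`). It rebuilds the data so it runs on its own; `df`, `top3`, and `label` are illustrative names.

```r
library(ggplot2)

# Same example data as in the question
df <- data.frame(
  site   = rep(c("Site1", "Site2", "Site3", "Site4"), each = 7),
  groups = rep(c("1", "2", "3", "4", "5", "6", "Missing"), 4),
  counts = c(7554, 6982, 6296, 16152, 6416, 2301, 0,
             20704, 10385, 22041, 27596, 4648, 1325, 0,
             17200, 11950, 11836, 12303, 2817, 911, 1,
             2580, 2620, 2828, 2839, 507, 152, 2)
)

# Share of each group within its site
df$perc <- df$counts / ave(df$counts, df$site, FUN = sum)

# Keep a label only for the three largest shares per site
top3 <- ave(-df$perc, df$site,
            FUN = function(x) rank(x, ties.method = "first")) <= 3
df$label <- ifelse(top3, sprintf("%.1f%%", df$perc * 100), "")

# geom_col() draws the values as given; position_stack(vjust = 0.5)
# centres each label inside its own segment
p <- ggplot(df, aes(x = site, y = perc, fill = groups)) +
  geom_col() +
  geom_text(aes(label = label), position = position_stack(vjust = 0.5)) +
  scale_y_continuous(labels = scales::percent)
p
```

Because `perc` is already a proportion between 0 and 1, the percent formatter on the y axis now reads 0% to 100%.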

EDIT: Looks like your scales are wrong in your original plotting code. It gave me results with 7500000% on the y axis, which seemed a little off to me...

EDIT: I fixed up the code.
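
On the legend question, which the code above does not address: ggplot2 orders a discrete legend by the factor levels of the mapped variable, so one option is to set the levels explicitly before plotting. This is a small sketch (`groups_f` is an illustrative name); note that changing the levels also changes the stacking order of the bars.

```r
groups <- rep(c("1", "2", "3", "4", "5", "6", "Missing"), 4)

# Spell out the desired order, with "Missing" first
groups_f <- factor(groups,
                   levels = c("Missing", "1", "2", "3", "4", "5", "6"))

# relevel() is a shorter way to move a single level to the front
groups_r <- relevel(factor(groups), ref = "Missing")

levels(groups_f)
```

Using `groups_f` in place of `groups` when building the data frame should then list "Missing" first in the legend.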
