How to label a stacked histogram in ggplot


Problem description

I am trying to add labels to the coloured segments of the bars in a histogram. Here is a reproducible example.

ggplot(mpg, aes(displ)) + geom_histogram(aes(fill = class), binwidth = 1, col = "black")

This code draws a histogram and fills the bars with a different colour for each car "class". Is there any way to add the "class" labels inside the corresponding coloured segments of the graph?

Solution

The built-in functions geom_histogram and stat_bin are ideal for building plots quickly in ggplot. For more advanced styling, however, you often need to prepare the data before building the plot. In your case, labelling every segment produces overlapping labels, which look messy.

The following code builds a binned frequency table from the data frame:

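library(ggplot2)  # provides the mpg dataset
library(plyr)     # provides ddply() and .()

# Subset data
mpg_df <- data.frame(displ = mpg$displ, class = mpg$class)
# Optional: inspect the raw counts per displ/class pair (needs the reshape2 package)
reshape2::melt(table(mpg_df[, c("displ", "class")]))

# Bin data: unit-wide bins with edges at 0.5, 1.5, ..., 7.5
breaks <- 1
cuts <- seq(0.5, 8, breaks)
mpg_df$bin <- .bincode(mpg_df$displ, cuts)

# Count the observations in each class/bin combination
mpg_df <- ddply(mpg_df, .(class, bin), nrow)
names(mpg_df) <- c("class", "bin", "Freq")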
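
To make the binning step concrete, here is a minimal base R sketch of how .bincode assigns values to the intervals defined by cuts. The displacement values are made up for illustration and are not taken from mpg.

```r
# Bin edges at 0.5, 1.5, ..., 7.5, the same as `cuts` above
cuts <- seq(0.5, 8, 1)

# A few illustrative displacement values (hypothetical, not taken from mpg)
displ <- c(1.6, 2.5, 3.5, 7.0)

# .bincode() returns the index of the interval (lower, upper] that contains
# each value; intervals are closed on the right by default
bins <- .bincode(displ, cuts)
print(data.frame(displ, bins))
```

Because the intervals are right-closed, a value that sits exactly on an edge (such as 2.5) falls into the lower bin. With these edges, bin k covers (k - 0.5, k + 0.5], so the bin index is roughly the displacement rounded to the nearest litre.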

You can use this new table to set a conditional label, so that segments are labelled only if they contain at least a certain number of observations (4 in this example):

ggplot(mpg_df, aes(x = bin, y = Freq,  fill = class)) +
  geom_bar(stat = "identity", colour = "black", width = 1) +
  geom_text(aes(label=ifelse(Freq >= 4, as.character(class), "")),
   position=position_stack(vjust=0.5), colour="black")
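
The conditional label can also be checked on its own, outside the plot. The sketch below uses a small made-up frequency table (the values are hypothetical) shaped like mpg_df and stores the result in an illustrative label column.

```r
# Toy frequency table (hypothetical values, shaped like mpg_df above)
toy <- data.frame(
  class = factor(c("compact", "suv", "2seater")),
  bin   = c(2, 2, 6),
  Freq  = c(10, 3, 4)
)

# Label only the segments with at least 4 observations; as.character()
# converts the factor to its level names before ifelse() combines the values
toy$label <- ifelse(toy$Freq >= 4, as.character(toy$class), "")
print(toy)
```

The as.character() call matters because ifelse() drops factor attributes; passing the factor directly would most likely produce the integer level codes rather than the class names.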

I don't think it makes much sense to repeat the class labels, though; it may be more useful to show the frequency of each group:

ggplot(mpg_df, aes(x = bin, y = Freq,  fill = class)) +
  geom_bar(stat = "identity", colour = "black", width = 1) +
  geom_text(aes(label=ifelse(Freq >= 4, Freq, "")),
   position=position_stack(vjust=0.5), colour="black")

Update

I realised you can actually filter the labels selectively using ggplot's computed variable ..count.., so there is no need to preformat the data!

ggplot(mpg, aes(x = displ, fill = class, label = class)) +
  geom_histogram(binwidth = 1,col="black") +
  stat_bin(binwidth = 1, geom = "text", position = position_stack(vjust = 0.5),
           aes(label = ifelse(..count.. > 4, ..count.., "")))

This post is useful for explaining special variables within ggplot: Special variables in ggplot (..count.., ..density.., etc.)

This second approach will only work if you want to label the dataset with the counts. If you want to label the dataset by the class or another parameter, you will have to prebuild the data frame using the first method.
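
On recent ggplot2 releases the ..count.. notation is deprecated in favour of after_stat(count). The following is a sketch of the same count-labelled plot written that way, assuming ggplot2 3.3.0 or later is installed; the variable name p is only for illustration.

```r
library(ggplot2)  # assumes ggplot2 >= 3.3.0, which provides after_stat()

# Same stacked histogram as above; the text layer reuses stat_bin so that
# the labels are computed from the same bins as the bars
p <- ggplot(mpg, aes(x = displ, fill = class)) +
  geom_histogram(binwidth = 1, colour = "black") +
  stat_bin(
    binwidth = 1, geom = "text",
    position = position_stack(vjust = 0.5),
    # after_stat(count) refers to the per-bin count computed by stat_bin;
    # segments with 4 or fewer observations get an empty label
    aes(label = ifelse(after_stat(count) > 4, after_stat(count), ""))
  )
```

Typing p at the console (or calling print(p)) draws the plot. As with the ..count.. version, this labels segments by their counts only.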
