Overlay raw data onto geom_bar

Problem description

I have a data frame arranged as follows:

condition,treatment,value
A        ,  one    , 2
A        ,  one    , 1
A        ,  two    , 4
A        ,  two    , 2
...
D        ,  two    , 3

I have used ggplot2 to make a grouped bar plot from this data.

The bars are grouped by "condition" and the colours indicate "treatment". The bar heights are the means of the values for each condition/treatment pair. I achieved this by creating a new data frame containing the mean and standard error (for the error bars) of the points that make up each group.

What I would like to do is superimpose the raw, jittered data to produce a bar-chart version of this box plot: http://docs.ggplot2.org/0.9.3.1/geom_boxplot-6.png (I realise that a box plot would probably be better, but my hands are tied because the client is pathologically attached to bar charts.)

I have tried adding a geom_point object to my plot and feeding it the raw data (rather than the aggregated means that were used to make the bars). This sort of works, but it plots the raw values at the wrong x-axis locations. They appear at the points where the red and grey bars join, rather than at the centres of the appropriate bars.

I cannot figure out how to shift the points by a fixed amount and then jitter them so that they are centred over the correct bar. Does anyone know how? Is there, perhaps, a better way of achieving what I'm trying to do?

What follows is a minimal example that shows the problem:

library(ggplot2)

# Make some fake data: 4 conditions x 2 treatments, 4 values per pair
ex <- data.frame(cond = rep(c('a','b','c','d'), each = 8),
    treat = rep(rep(c('one','two'), 4), each = 4),
    value = rnorm(32) + rep(c(3,1,4,2), each = 4))

# Calculate the mean and SD of each condition/treatment pair
agg <- aggregate(value ~ cond * treat, data = ex, FUN = "mean")  # mean
agg$sd <- aggregate(value ~ cond * treat, data = ex, FUN = "sd")$value  # add the SD

dodge <- position_dodge(width = 0.9)
limits <- aes(ymax = value + sd, ymin = value - sd)  # set up the error bars

p <- ggplot(agg, aes(fill = treat, y = value, x = cond))

# Plot, attempting to overlay the raw data
print(
    p + geom_bar(position = dodge, stat = "identity") +
    geom_errorbar(limits, position = dodge, width = 0.25) +
    geom_point(data = ex[ex$treat == 'one', ], colour = "green", size = 3) +
    geom_point(data = ex[ex$treat == 'two', ], colour = "pink", size = 3)
)

Recommended answer

You need just one call to geom_point(), in which you use the data frame ex and, inside aes(), set x to cond, y to value and color to treat. Then add position=dodge to ensure that the points are dodged in the same way as the bars. With scale_color_manual() and its values= argument you can set the colours you need.

    p+geom_bar(position=dodge, stat="identity") +
      geom_errorbar(limits, position=dodge, width=0.25)+
      geom_point(data=ex,aes(cond,value,color=treat),position=dodge)+
      scale_color_manual(values=c("green","pink"))

You can't directly combine the positions dodge and jitter in a single position argument, but there are workarounds. If you save the whole plot as an object, ggplot_build() lets you see the x positions of the bars. In this case they are 0.775, 1.225, 1.775, and so on. Those positions correspond to the combinations of the factors cond and treat. Because the data frame ex holds 4 values for each combination, add a new column that contains each of those x positions repeated 4 times.

ex$xcord<-rep(c(0.775,1.225,1.775,2.225,2.775,3.225,3.775,4.225),each=4)
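
Rather than typing the positions by hand, they can be read from the built plot. The following is a minimal sketch, assuming that ex is ordered by cond and then treat with four rows per combination (as in the question). The fixed seed and the names bars and bar_x are introduced here for illustration only, and the data is rebuilt so the block runs on its own. It uses layer_data(), which returns the same per-layer data that ggplot_build() computes.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only to make the fake data reproducible
ex <- data.frame(cond = rep(c("a", "b", "c", "d"), each = 8),
                 treat = rep(rep(c("one", "two"), 4), each = 4),
                 value = rnorm(32) + rep(c(3, 1, 4, 2), each = 4))
agg <- aggregate(value ~ cond * treat, data = ex, FUN = mean)

# Bars only: this is the layer whose dodged x positions we want to inspect
bars <- ggplot(agg, aes(x = cond, y = value, fill = treat)) +
  geom_bar(position = position_dodge(width = 0.9), stat = "identity")

# Computed data for layer 1 (the bars); x holds the dodged bar centres
bar_x <- sort(unique(as.numeric(layer_data(bars, 1)$x)))

# Relies on the row order of ex: cond, then treat, 4 rows per combination
ex$xcord <- rep(bar_x, each = 4)
```

If the rows of ex were in a different order, the positions would have to be matched to each cond/treat combination explicitly instead of being repeated in sequence.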

Now, in geom_point(), use this new column as the x values and set the position to jitter.

p+geom_bar(position=dodge, stat="identity") +
  geom_errorbar(limits, position=dodge, width=0.25)+
  geom_point(data=ex,aes(xcord,value,color=treat),position=position_jitter(width =.15))+
  scale_color_manual(values=c("green","pink"))
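
Later ggplot2 releases also provide position_jitterdodge(), which dodges and jitters in one step, so the hand-built xcord column may not be needed. The following is a minimal sketch, assuming a ggplot2 version that includes this function. The fixed seed and the name p_jd are introduced here for illustration only, and the data is rebuilt so the block runs on its own.

```r
library(ggplot2)

set.seed(1)  # assumption: fixed seed, only to make the fake data reproducible
ex <- data.frame(cond = rep(c("a", "b", "c", "d"), each = 8),
                 treat = rep(rep(c("one", "two"), 4), each = 4),
                 value = rnorm(32) + rep(c(3, 1, 4, 2), each = 4))
agg <- aggregate(value ~ cond * treat, data = ex, FUN = mean)
agg$sd <- aggregate(value ~ cond * treat, data = ex, FUN = sd)$value

dodge <- position_dodge(width = 0.9)

p_jd <- ggplot(agg, aes(x = cond, y = value, fill = treat)) +
  geom_bar(position = dodge, stat = "identity") +
  geom_errorbar(aes(ymin = value - sd, ymax = value + sd),
                position = dodge, width = 0.25) +
  # dodge.width matches the bars so each point cloud sits over its own bar;
  # jitter.width controls the horizontal scatter within that bar
  geom_point(data = ex, aes(colour = treat),
             position = position_jitterdodge(jitter.width = 0.15,
                                             dodge.width = 0.9)) +
  scale_colour_manual(values = c("green", "pink"))

print(p_jd)
```

Setting dodge.width to the same value as the bars' position_dodge() width is what keeps the points centred over the bars; the default dodge.width differs from 0.9.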
