Grouped barplot in R with error bars


Problem description


Dear Stack Overflow users,

I would like to draw a grouped barplot with error bars. Here is the kind of figure I have been able to get up to now, and this is ok for what I need:

And here is my script:

#create dataframe
Gene<-c("Gene1","Gene2","Gene1","Gene2")
count1<-c(12,14,16,34)
count2<-c(4,7,9,23)
count3<-c(36,22,54,12)
count4<-c(12,24,35,23)
Species<-c("A","A","B","B")
df<-data.frame(Gene,count1,count2,count3,count4,Species)
df

mean1<-mean(as.numeric(df[1,][c(2,3,4,5)]))
mean2<-mean(as.numeric(df[2,][c(2,3,4,5)]))
mean3<-mean(as.numeric(df[3,][c(2,3,4,5)]))
mean4<-mean(as.numeric(df[4,][c(2,3,4,5)]))
Gene1SpeciesA.stdev<-sd(as.numeric(df[1,][c(2,3,4,5)]))
Gene2SpeciesA.stdev<-sd(as.numeric(df[2,][c(2,3,4,5)]))
Gene1SpeciesB.stdev<-sd(as.numeric(df[3,][c(2,3,4,5)]))
Gene2SpeciesB.stdev<-sd(as.numeric(df[4,][c(2,3,4,5)]))

ToPlot<-c(mean1,mean2,mean3,mean4)

#plot barplot
plot<-matrix(ToPlot,2,2,byrow=TRUE)   #with 2 being replaced by the number of genes!
tplot<-t(plot)
BarPlot <- barplot(tplot, beside=TRUE,ylab="count",
                names.arg=c("Gene1","Gene2"),col=c("blue","red"))

#add legend
legend("topright", 
       legend = c("SpeciesA","SpeciesB"), 
       fill = c("blue","red"))

#add error bars
ee<-matrix(c(Gene1SpeciesA.stdev,Gene2SpeciesA.stdev,Gene1SpeciesB.stdev,Gene2SpeciesB.stdev),2,2,byrow=TRUE)*1.96/sqrt(4)   
tee<-t(ee)
error.bar(BarPlot,tplot,tee)
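
Note that `error.bar()` is not part of base R and is not defined anywhere in the post, so the script above only runs if such a helper already exists in the session. A minimal sketch, assuming it is the usual `arrows()`-based helper (the signature, the default `length` and the example half-widths below are assumptions, not taken from the post):

```r
# Hypothetical helper (not shown in the original post): draws error bars at
# the bar midpoints `x` returned by barplot(), around the bar heights `y`.
error.bar <- function(x, y, upper, lower = upper, length = 0.1, ...) {
  if (length(x) != length(y) || length(y) != length(upper) ||
      length(upper) != length(lower)) {
    stop("x, y, upper and lower must have the same length")
  }
  # code = 3 puts a flat cap (angle = 90) on both ends of each segment
  arrows(x, y + upper, x, y - lower, angle = 90, code = 3, length = length, ...)
}

# Usage with the four means from the question (rows = species, columns = genes)
means <- matrix(c(16, 28.5, 16.75, 23), nrow = 2,
                dimnames = list(c("A", "B"), c("Gene1", "Gene2")))
half  <- matrix(c(2, 3, 1.5, 2.5), nrow = 2)  # illustrative half-widths only

bar_x <- barplot(means, beside = TRUE, ylab = "count",
                 col = c("blue", "red"), ylim = c(0, 35))
error.bar(bar_x, means, half)
```

The helper has to be defined before the call to `error.bar(BarPlot, tplot, tee)`.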

The problem is that I need to do this for 50 genes and 4 species, so my script is going to get extremely long, and I guess it is not optimized... I tried to find help here but I can't figure out a better way to do what I'd like. If I did not need error bars I could adapt this script, but the tricky part is combining ggplot's beautiful barplots with error bars! ;)

If you have any idea how to optimize my script, I would really appreciate it! :)

Thanks a lot!

Solution

Starting from your definition of df, you can do this in a few lines:

library(ggplot2)

cols = c(2,3,4,5)
df1  = transform(df, mean=rowMeans(df[cols]), sd=apply(df[cols],1, sd))

# df1 looks like this
#   Gene count1 count2 count3 count4 Species  mean        sd
#1 Gene1     12      4     36     12       A 16.00 13.856406
#2 Gene2     14      7     22     24       A 16.75  7.804913
#3 Gene1     16      9     54     35       B 28.50 20.240224
#4 Gene2     34     23     12     23       B 23.00  8.981462

ggplot(df1, aes(x=as.factor(Gene), y=mean, fill=Species)) +
  geom_bar(position=position_dodge(), stat="identity", colour='black') +
  geom_errorbar(aes(ymin=mean-sd, ymax=mean+sd), width=.2,position=position_dodge(.9))
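
The bars above extend one standard deviation either side of the mean, whereas the script in the question used a half-width of 1.96 * sd / sqrt(4). The following is a rough base R sketch (no ggplot2) of how the same row-wise summary could scale to any number of genes and species while keeping the question's half-width. The `"^count"` column pattern, the variable names and the two-colour palette are assumptions made for illustration:

```r
# Data frame from the question: one row per gene/species, four replicate counts
df <- data.frame(
  Gene    = c("Gene1", "Gene2", "Gene1", "Gene2"),
  count1  = c(12, 14, 16, 34),
  count2  = c(4, 7, 9, 23),
  count3  = c(36, 22, 54, 12),
  count4  = c(12, 24, 35, 23),
  Species = c("A", "A", "B", "B")
)

# Assumption: every replicate column is named count<something>
count_cols <- grep("^count", names(df), value = TRUE)
n_rep <- length(count_cols)

df$mean <- rowMeans(df[count_cols])
df$sd   <- apply(df[count_cols], 1, sd)
df$ci   <- 1.96 * df$sd / sqrt(n_rep)  # half-width used in the question

# Species x Gene matrices: each column is a group of bars, each row one bar
means <- tapply(df$mean, list(df$Species, df$Gene), mean)
ci    <- tapply(df$ci,   list(df$Species, df$Gene), mean)

bar_x <- barplot(means, beside = TRUE, ylab = "count",
                 col = c("blue", "red"),  # one colour per species
                 ylim = c(0, max(means + ci) * 1.1),
                 legend.text = paste0("Species", rownames(means)))

# Error bars at the bar midpoints returned by barplot()
arrows(bar_x, means - ci, bar_x, means + ci, angle = 90, code = 3, length = 0.05)
```

With 4 species, the `col` vector would need four colours. In the ggplot2 version, the same half-width could presumably be used by computing a `ci` column in `df1` and mapping `ymin = mean - ci, ymax = mean + ci` in `geom_errorbar()`.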
