Tukey's post-hoc on ggplot boxplot


Question

Ok, so I think I'm pretty close with this, but I'm getting an error when I try to construct my box plot at the end. My goal is to place letters denoting statistical relationships among the time points above each boxplot. I've seen two discussions of this on this site, and can reproduce the results from their code, but can't apply it to my dataset.

Packages

library(ggplot2)
library(multcompView)
library(plyr)

Here is my data:

dput(WaterConDryMass)
structure(list(ChillTime = structure(c(1L, 1L, 1L, 1L, 2L, 2L, 
2L, 2L, 3L, 3L, 3L, 3L, 4L, 4L, 4L, 4L, 5L, 5L, 5L, 5L), .Label = c("Pre_chill", 
"6", "13", "24", "Post_chill"), class = "factor"), dmass = c(0.22, 
0.19, 0.34, 0.12, 0.23, 0.33, 0.38, 0.15, 0.31, 0.34, 0.45, 0.48, 
0.59, 0.54, 0.73, 0.69, 0.53, 0.57, 0.39, 0.8)), .Names = c("ChillTime", 
"dmass"), row.names = c(NA, -20L), class = "data.frame")

ANOVA and Tukey Post-hoc

Model4 <- aov(dmass~ChillTime, data=WaterConDryMass)
tHSD <- TukeyHSD(Model4, ordered = FALSE, conf.level = 0.95)
plot(tHSD , las=1 , col="brown" )

Function:

generate_label_df <- function(TUKEY, flev){

  # Extract labels and factor levels from Tukey post-hoc 
  Tukey.levels <- TUKEY[[flev]][,4]
  Tukey.labels <- multcompLetters(Tukey.levels)['Letters']
  plot.labels <- names(Tukey.labels[['Letters']])

  boxplot.df <- ddply(WaterConDryMass, flev, function (x) max(fivenum(x$y)) + 0.2)

  # Create a data frame out of the factor levels and Tukey's homogenous group letters
  plot.levels <- data.frame(plot.labels, labels = Tukey.labels[['Letters']],
                            stringsAsFactors = FALSE) 

  # Merge it with the labels
  labels.df <- merge(plot.levels, boxplot.df, by.x = 'plot.labels', by.y = flev, sort = FALSE)
  return(labels.df)
}  

Boxplot:

ggplot(WaterConDryMass, aes(x = ChillTime, y = dmass)) +
  geom_blank() +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(x = 'Time (weeks)', y = 'Water Content (DM %)') +
  ggtitle(expression(atop(bold("Water Content"), atop(italic("(Dry Mass)"), "")))) +
  theme(plot.title = element_text(hjust = 0.5, face='bold')) +
  annotate(geom = "rect", xmin = 1.5, xmax = 4.5, ymin = -Inf, ymax = Inf, alpha = 0.6, fill = "grey90") +
  geom_boxplot(fill = 'green2', stat = "boxplot") +
  geom_text(data = generate_label_df(tHSD), aes(x = plot.labels, y = V1, label = labels)) +
  geom_vline(aes(xintercept=4.5), linetype="dashed") +
  theme(plot.title = element_text(vjust=-0.6))

Error:

Error in HSD[[flev]] : invalid subscript type 'symbol'

Solution

I think I found the tutorial you are following, or something very similar. You would probably be best off copying and pasting this whole thing into your workspace, function and all, to avoid missing a few small differences.
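As for the error itself, it most likely comes from the call generate_label_df(tHSD) in the plot code: the function takes two arguments, but only the Tukey object is passed, so flev is missing when TUKEY[[flev]] is evaluated. Here is a minimal sketch of that behaviour in base R. The names get_level and tukey_like are hypothetical stand-ins, not part of the question's code.

```r
# Hypothetical stand-in for the indexing step inside generate_label_df()
get_level <- function(TUKEY, flev) TUKEY[[flev]]

# Hypothetical stand-in for a TukeyHSD result: a list named by model term
tukey_like <- list(ChillTime = "table of comparisons")

# Second argument omitted, as in generate_label_df(tHSD):
# indexing with a missing argument raises an error, captured here
res <- tryCatch(get_level(tukey_like), error = function(e) e)
print(conditionMessage(res))

# Second argument supplied as a string: the element is found
print(get_level(tukey_like, "ChillTime"))
```

Passing the term name as a string, e.g. generate_label_df(tHSD, "ChillTime"), gets past this step. The function in the question also reads x$y inside ddply, and this data has no column named y (the measurement column is dmass), so that line would need changing as well.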

Basically I have followed the tutorial (http://www.r-graph-gallery.com/84-tukey-test/) to the letter and added a few necessary tweaks at the end. It adds a few extra lines of code, but it works.

generate_label_df <- function(TUKEY, variable){

  # Extract labels and factor levels from Tukey post-hoc 
  Tukey.levels <- TUKEY[[variable]][,4]
  Tukey.labels <- data.frame(multcompLetters(Tukey.levels)['Letters'])

  # Put the labels in the same order as in the boxplot
  Tukey.labels$treatment <- rownames(Tukey.labels)
  Tukey.labels <- Tukey.labels[order(Tukey.labels$treatment), ]
  return(Tukey.labels)
}

model <- lm(WaterConDryMass$dmass ~ WaterConDryMass$ChillTime)
ANOVA <- aov(model)

# Tukey test to study each pair of treatments:
TUKEY <- TukeyHSD(x = ANOVA, 'WaterConDryMass$ChillTime', conf.level = 0.95)

# Generate labels using the function
labels <- generate_label_df(TUKEY, "WaterConDryMass$ChillTime")

# Rename columns for merging
names(labels) <- c('Letters', 'ChillTime')

# Obtain letter position for the y axis using group means
yvalue <- aggregate(. ~ ChillTime, data = WaterConDryMass, mean)

# Merge data frames
final <- merge(labels, yvalue)

ggplot(WaterConDryMass, aes(x = ChillTime, y = dmass)) +
  geom_blank() +
  theme_bw() +
  theme(panel.grid.major = element_blank(), panel.grid.minor = element_blank()) +
  labs(x = 'Time (weeks)', y = 'Water Content (DM %)') +
  ggtitle(expression(atop(bold("Water Content"), atop(italic("(Dry Mass)"), "")))) +
  theme(plot.title = element_text(hjust = 0.5, face='bold')) +
  annotate(geom = "rect", xmin = 1.5, xmax = 4.5, ymin = -Inf, ymax = Inf, alpha = 0.6, fill = "grey90") +
  geom_boxplot(fill = 'green2', stat = "boxplot") +
  geom_text(data = final, aes(x = ChillTime, y = dmass, label = Letters),vjust=-3.5,hjust=-.5) +
  geom_vline(aes(xintercept=4.5), linetype="dashed") +
  theme(plot.title = element_text(vjust=-0.6))
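The code above positions each letter at the group mean and then nudges it with vjust and hjust. If you would rather have the letters sit just above the highest point of each box, one option is to compute the per-group maximum and add a small offset. This is only a sketch of that variation, not the tutorial's method: the names label_df, ymax and ypos and the 0.05 offset are my own assumptions, and the model is fitted with a data argument so the Tukey term is simply "ChillTime".

```r
library(ggplot2)
library(multcompView)

# Data rebuilt from the dput() output in the question
WaterConDryMass <- data.frame(
  ChillTime = factor(rep(c("Pre_chill", "6", "13", "24", "Post_chill"), each = 4),
                     levels = c("Pre_chill", "6", "13", "24", "Post_chill")),
  dmass = c(0.22, 0.19, 0.34, 0.12, 0.23, 0.33, 0.38, 0.15, 0.31, 0.34,
            0.45, 0.48, 0.59, 0.54, 0.73, 0.69, 0.53, 0.57, 0.39, 0.8)
)

fit  <- aov(dmass ~ ChillTime, data = WaterConDryMass)
tHSD <- TukeyHSD(fit, conf.level = 0.95)

# Adjusted p-values live in the "p adj" column (column 4) of the Tukey table
letters_vec <- multcompLetters(tHSD[["ChillTime"]][, "p adj"])$Letters

# One row per factor level, keeping the original level order
label_df <- data.frame(
  ChillTime = factor(names(letters_vec),
                     levels = levels(WaterConDryMass$ChillTime)),
  Letters = unname(letters_vec),
  stringsAsFactors = FALSE
)

# Highest observation per group, plus an assumed offset of 0.05
ymax <- aggregate(dmass ~ ChillTime, data = WaterConDryMass, FUN = max)
label_df <- merge(label_df, ymax, by = "ChillTime")
label_df$ypos <- label_df$dmass + 0.05

p <- ggplot(WaterConDryMass, aes(x = ChillTime, y = dmass)) +
  geom_boxplot(fill = "green2") +
  geom_text(data = label_df, aes(x = ChillTime, y = ypos, label = Letters)) +
  theme_bw()
print(p)
```

The theme, title and annotate layers from the plot above can be added to p unchanged. The offset is in data units, so it would need adjusting for data on a different scale.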
