Consistent lettering across facets for Tukey letter plot on ggplot

Problem description

I have followed the answer from this question: Tukey test results on geom_boxplot with facet_grid

It works well, but I would also like to compare the facets with each other. In other words, assign the letters across all of the results first and then divide them into facets (I have both horizontal and vertical facets). How can I do this? Also, how can I reorder the letters so that they start from "a" at the first variable in the first facet, then "b" at the second variable, and so on? I tried the following, but it did not order the letters the way I want.

TUKEY <- TukeyHSD(ANOVA, ordered = TRUE)

Here is reproducible code (the plotting code was taken from the link above, and the data generation from http://sape.inf.usi.ch/quick-reference/ggplot2/facet):

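library(ggplot2)       # plotting
library(multcompView)  # multcompLetters()
library(ggsci)         # scale_color_npg()

d=expand.grid(obs=0:10, benchmark=c('antlr', 'bloat', 'chart'), gc=c('CopyMS', 'GenCopy', 'GenImmix'), opt=c('on', 'off', 'valid'), heapSize=seq(from=1.5, to=4, by=0.5))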
d$time = rexp(nrow(d), 0.01)+1000
d$time = d$time + abs(d$heapSize-3)*100
d$time[d$opt=='on'] = d$time[d$opt=='on']-200

d$time[d$opt=='on' & d$benchmark=='bloat'] = d$time[d$opt=='on' & d$benchmark=='bloat'] + 190

generate_label_df <- function(TUKEY, variable){

  # Extract labels and factor levels from Tukey post-hoc 
  Tukey.levels <- variable[,4]
  Tukey.labels <- data.frame(multcompLetters(Tukey.levels)['Letters'])

  #I need to put the labels in the same order as in the boxplot :
  Tukey.labels$treatment=rownames(Tukey.labels)
  Tukey.labels=Tukey.labels[order(Tukey.labels$treatment) , ]
  return(Tukey.labels)
}


TUKEYplot <- function(df){

  p<-ggplot(data=df)+
    aes(x = opt, y = time, colour = opt) +
    geom_boxplot() +
    facet_grid(gc~benchmark) +
    theme_linedraw() +
    theme(axis.text.x=element_text(angle=45, hjust=1)) +
    ylim(min(df$time),max(df$time)+0.05) +
    labs(x = "type", y= "time", color = "state") +
    theme(strip.background = element_rect(colour = "black", fill = "white")) +
    theme(strip.text = element_text(colour = "black", size=12)) +
    theme(axis.text=element_text(size=12)) +
    theme(legend.text=element_text(size=12)) +
    theme(legend.title=element_text(size=12,face="bold")) +
    theme(axis.title=element_text(size=14,face="bold")) +
    scale_color_npg()
  for (facetk2 in as.character(unique(df$gc))) {   
    for (facetk in as.character(unique(df$benchmark))) {   
      subdf <- subset(df, df$benchmark==facetk & df$gc==facetk2)
      model=lm(time ~ opt, data=subdf)
      ANOVA=aov(model)
      # Tukey test to study each pair of treatment :
      TUKEY <- TukeyHSD(ANOVA)
      print(TUKEY)
      labels <- generate_label_df(TUKEY , TUKEY$`opt`)
      names(labels) <- c('Letters', 'opt')
      # Aggregate only time: quantile() cannot be applied to the factor columns
      yvalue <- aggregate(time ~ opt, data=subdf, quantile, probs=.75)
      final <- merge(labels, yvalue)
      final$benchmark <-  facetk
      final$gc <-  facetk2

      p <- p + geom_text(data = final,  aes(x=opt, y=time, label=Letters), 
                         vjust=-1.2, hjust=-.5, show.legend = FALSE, size=5)
    }
  }
  return (p)
}

p1<-TUKEYplot(d)
p1                     
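As background for why the letters above are not comparable between panels, here is a minimal sketch of what `multcompLetters` does with a set of pairwise p-values. The p-values are made up for illustration; only the naming pattern ("level1-level2", as in the row names of a `TukeyHSD` result) matters. Because the function above calls it once per facet, an "a" in one panel says nothing about an "a" in another.

```r
library(multcompView)

# Hypothetical adjusted p-values for three groups. The names must have the
# form "level1-level2", like the row names of a TukeyHSD result.
p_adj <- c("off-on" = 0.001, "valid-on" = 0.002, "valid-off" = 0.80)

# Pairs with p below the threshold are treated as different and
# therefore do not share a letter.
res <- multcompLetters(p_adj, threshold = 0.05)

# Named character vector: one letter string per group.
res$Letters
```

Here "off" and "valid" should end up sharing a letter, while "on" gets a different one.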



Update: Visual aid of what I would like to do:

Original plot:

Desired plot partially:

Solution

I finally figured out how to do it, so I am posting the answer. The key is to take the Tukey calculation out of the loop: fit a single ANOVA that includes the gc:benchmark:opt interaction and run TukeyHSD on it once, so that every box is compared with every other box and the letters are consistent across facets. The label names are then split into one column per factor (make sure your factor levels do not contain ":"; you can use revalue to recode them if they do), and the loop over the facet levels only picks out the labels that belong to each panel.
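To make the naming convention concrete, here is a small base R sketch on toy data (the data and the two-factor model are assumptions for illustration, not the benchmark data from the question). It shows that the interaction term of a `TukeyHSD` result compares every cell with every other cell, and that the comparison names can be split back into factor columns on ":".

```r
set.seed(42)

# Toy data (hypothetical): two factors standing in for gc and opt.
toy <- expand.grid(rep = 1:8, gc = c("CopyMS", "GenCopy"), opt = c("on", "off"))
toy$time <- rnorm(nrow(toy), mean = ifelse(toy$opt == "on", 800, 1000), sd = 20)

# One model for all cells, one Tukey test including the interaction term.
tukey <- TukeyHSD(aov(time ~ gc * opt, data = toy))

# Row names of the interaction table have the form "gc:opt-gc:opt".
pairs <- rownames(tukey$`gc:opt`)
pairs

# Four cells, so choose(4, 2) = 6 pairwise comparisons.
length(pairs)

# Split the names back into one column per factor.
cells <- unique(unlist(strsplit(pairs, "-", fixed = TRUE)))
split_cells <- do.call(rbind, strsplit(cells, ":", fixed = TRUE))
colnames(split_cells) <- c("gc", "opt")
split_cells
```

This is also why factor levels must not contain ":" (or "-", which separates the two sides of a comparison): the splitting would produce the wrong number of pieces.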

library(tidyr)  # separate(); also re-exports %>%

TUKEYplot <- function(df){

  p<-ggplot(data=df)+
    aes(x = opt, y = time, colour = opt) +
    geom_boxplot() +
    facet_grid(gc~benchmark) +
    theme_linedraw() +
    theme(axis.text.x=element_text(angle=45, hjust=1)) +
    ylim(min(df$time),max(df$time)+0.05) +
    labs(x = "type", y= "time", color = "state") +
    theme(strip.background = element_rect(colour = "black", fill = "white")) +
    theme(strip.text = element_text(colour = "black", size=12)) +
    theme(axis.text=element_text(size=12)) +
    theme(legend.text=element_text(size=12)) +
    theme(legend.title=element_text(size=12,face="bold")) +
    theme(axis.title=element_text(size=14,face="bold")) +
    scale_color_npg()

  model=lm(time ~ gc*benchmark*opt, data=df)
  ANOVA=aov(model)
  # Tukey test to study each pair of treatment :
  TUKEY <- TukeyHSD(ANOVA)
  all_labels <- generate_label_df(TUKEY , TUKEY$`gc:benchmark:opt`)
  sep_labels<- all_labels %>% separate(col=treatment, into= c("gc", "benchmark", "opt"), sep=":")

  for (facetk2 in as.character(unique(df$gc))) {   
    for (facetk in as.character(unique(df$benchmark))) {   
      subdf <- subset(df, df$benchmark==facetk & df$gc==facetk2)
      labels <- subset(sep_labels, sep_labels$benchmark==facetk & sep_labels$gc==facetk2)
      labels <- subset(labels, select = -c(gc,benchmark))

      names(labels) <- c('Letters', 'opt')
      # Aggregate only time: quantile() cannot be applied to the factor columns
      yvalue <- aggregate(time ~ opt, data=subdf, quantile, probs=.75)
      final <- merge(labels, yvalue)
      final$benchmark <-  facetk
      final$gc <-  facetk2

      p <- p + geom_text(data = final,  aes(x=opt, y=time, label=Letters), 
                         vjust=-1.2, hjust=-.5, show.legend = FALSE, size=5)
    }
  }
  return (p)
}
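As a possible simplification (not part of the original answer), the nested loop could be replaced by a single `geom_text` layer: when the label data frame carries the same facet columns as the plotted data, ggplot2 places each row in its own panel. The sketch below uses a reduced toy data set and placeholder letters; the names `ypos`, `letters_df` and `final` are made up for the example, and in practice the letters would come from `sep_labels`.

```r
library(ggplot2)

# Reduced toy data with the same column names as in the question.
set.seed(1)
df <- expand.grid(obs = 1:10, opt = c("on", "off"),
                  gc = c("CopyMS", "GenCopy"), benchmark = c("antlr", "bloat"))
df$time <- rexp(nrow(df), 0.01) + 1000

# y position of each letter: upper quartile of every gc/benchmark/opt cell.
ypos <- aggregate(time ~ opt + gc + benchmark, data = df,
                  FUN = quantile, probs = 0.75)

# Placeholder letters (assumption): one row per box.
letters_df <- unique(df[, c("opt", "gc", "benchmark")])
letters_df$Letters <- "a"

# One label row per box, including the facet columns.
final <- merge(letters_df, ypos, by = c("opt", "gc", "benchmark"))

# A single text layer; x, y and colour are inherited from the main aes().
p <- ggplot(df, aes(x = opt, y = time, colour = opt)) +
  geom_boxplot() +
  facet_grid(gc ~ benchmark) +
  geom_text(data = final, aes(label = Letters),
            vjust = -1.2, hjust = -0.5, show.legend = FALSE, size = 5)
```

Printing `p` draws the plot. The result should match the looped version, with one layer instead of one layer per panel.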

Resulting image: (could not embed the image, because I don't have enough reputation..)
