在ggplot2中为每个组添加geom_rug之类的箱线图 [英] Add geom_rug like boxplots per group in ggplot2

查看:153
本文介绍了在ggplot2中为每个组添加geom_rug之类的箱线图的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我想每组在我的密度图的底部和顶部添加一个地毯箱图.我找不到实现,因此尝试尝试手动创建箱形图,然后将带有注解_自定义的图添加到图中.

I'd like to add a rug boxplot per group to the bottom and top of my density plot. I could not find an implementation, so I tried to manually create the boxplots and then add those with annotation_custom to the plot.

当前存在密度图的x轴与框线图不对齐的问题.我试图提取第一个图的界限,但只能找到一种方法来提取数据的界限.

Currently there is the problem that x axes of the density plot and the boxplots do not align. I tried to extract the limits of the first plot, but could only find a way to extract the limits of the data.

第二个问题是箱形图的精确y对齐,这应该与geom_rug处理它的相同.

The second problem is the exact y alignment of the boxplots, this should be the same as geom_rug handles it.

第三个问题是要确保密度图和箱形图使用相同的填充色.我使用手动方法解决了这个问题,但是如果不必在多个位置指定颜色,显然它将更加通用.

The third problem is to ensure that the same fill colors are used by the density and boxplots. I used a manual approach to solve this, but clearly it would be a lot more general if I do not have to specify the color in multiple places.

set.seed(123)
library(ggplot2)
library(ggpubr)
library(data.table)
Data <- data.table(x = rnorm(100),
                   group = rep(c("group1", "group2"), times = c(30, 70)))

# Colors for groups
colors <- c("group1" = "#66C2A5", "group2" = "#FC8D62")

p <-
  ggplot(Data, aes(x = x, fill = group, color = group)) +
  geom_density(alpha = 0.5) +
  scale_color_manual(values = colors) +
  scale_fill_manual(values = colors)

# Rugs
p +
  geom_rug(data = Data[group %in% "group1"]) +
  geom_rug(data = Data[group %in% "group2"], sides = "t")

#-----

# Boxplots
boxplot1 <-
  ggplot(Data[group %in% "group1"]) +
  geom_boxplot(aes(y = x), fill = colors[["group1"]]) +
  coord_flip() +
  theme_transparent()

boxplot2 <-
  ggplot(Data[group %in% "group2"]) +
  geom_boxplot(aes(y = x), fill = colors[["group2"]]) +
  coord_flip() +
  theme_transparent()

boxplot1_grob <- ggplotGrob(boxplot1)
boxplot2_grob <- ggplotGrob(boxplot2)

# Place box plots inside density plot
x <- ggplot_build(p)$layout$panel_scales_x[[1]]$range$range
xmin <- x[1]
xmax <- x[2]
y <- ggplot_build(p)$layout$panel_scales_y[[1]]$range$range
ymin <- y[1]
ymax <- y[2]

yoffset <- (1/28) * ymax
xoffset <- (1/28) * xmax

# Add boxplots with annotation_custom
p2 <- p +
  annotation_custom(grob = boxplot1_grob, xmin = xmin, xmax = xmax,
                    ymin = ymin - yoffset, ymax = ymin + yoffset) +
  annotation_custom(grob = boxplot2_grob,
                    xmin = xmin, xmax = xmax,
                    ymin = ymax - yoffset, ymax = ymax + yoffset)

p2

# Alignment is not correct
p2 +
  geom_rug(data = Data[group %in% "group1"]) +
  geom_rug(data = Data[group %in% "group2"], sides = "t")

推荐答案

有时候我做了一些类似的练习,尚未对其进行严格的测试,但它确实适用于您的用例.如果有任何问题,请通知我&我看看是否可以修复它们:

I made something similar for practice sometime back, & have yet to test it rigorously, but it does seem to work for your use case. If anything breaks, let me know & I'll see if I can fix them:

# with boxplots only
p +
  geom_marginboxplot(data = Data[Data$group %in% "group1", ], 
                     aes(y = 1), sides = "b") +
  geom_marginboxplot(data = Data[Data$group %in% "group2", ], 
                     aes(y = 1), sides = "t")

# with both boxplots & geom_rug (check that they align exactly)
p +
  geom_marginboxplot(data = Data[Data$group %in% "group1", ], 
                     aes(y = 1), sides = "b") +
  geom_marginboxplot(data = Data[Data$group %in% "group2", ], 
                     aes(y = 1), sides = "t") +
  geom_rug(data = Data[group %in% "group1"]) +
  geom_rug(data = Data[group %in% "group2"], sides = "t")

边际箱线图的尺寸模仿geom_rug的尺寸,占图板高度/宽度的3%. x& y必须在aes()中进行映射,尽管在这种情况下实际上并不需要y,因此我为它分配了值1作为占位符.

The marginal boxplot's dimensions imitate those of geom_rug, occupying 3% of the plot panel's height / width. Both x & y have to be mapped in aes(), though in this case y isn't actually needed, so I assigned it the value 1 as a placeholder.

运行以下命令以获得geom_marginboxplot:

library(ggplot2)
library(grid)

`%||%` <- function (x, y)  if (is.null(x))  y else x

geom_marginboxplot <- function(mapping = NULL, data = NULL,
                         ...,
                         sides = "bl",
                         outlier.shape = 16,
                         outlier.size = 1.5,
                         outlier.stroke = 0.5,
                         width = 0.9,
                         na.rm = FALSE,
                         show.legend = NA,
                         inherit.aes = TRUE) {

  layer(
    data = data,
    mapping = mapping,
    stat = StatMarginBoxplot,
    geom = GeomMarginBoxplot,
    position = "identity",
    show.legend = show.legend,
    inherit.aes = inherit.aes,
    params = list(
      sides = sides,
      outlier.shape = outlier.shape,
      outlier.size = outlier.size,
      outlier.stroke = outlier.stroke,
      width = width,
      notch = FALSE,
      notchwidth = 0.5,
      varwidth = FALSE,
      na.rm = na.rm,
      ...
    )
  )
}

StatMarginBoxplot <- ggproto(
  "StatMarginBoxplot", Stat,
  optional_aes = c("x", "y"),
  non_missing_aes = "weight",

  setup_data = function(data, params, 
                        sides = "bl") {
    if(grepl("l|r", sides)){
      data.vertical <- data
      data.vertical$orientation <- "vertical"
    } else data.vertical <- data.frame()
    if(grepl("b|t", sides)){
      data.horizontal <- data
      data.horizontal$y <- data.horizontal$x
      data.horizontal$orientation <- "horizontal"
    } else data.horizontal <- data.frame()
    data <- remove_missing(rbind(data.vertical, 
                                 data.horizontal),
                           na.rm = FALSE, vars = "x", 
                           "stat_boxplot")
    data
  },

  compute_group = function(data, scales, sides = "bl", 
                           width = 0.9, na.rm = FALSE, coef = 1.5){

    if(grepl("l|r", sides)){
      df.vertical <- do.call(environment(StatBoxplot$compute_group)$f,
                             args = list(data = data[data$orientation == "vertical", ], 
                                         scales = scales, width = width,
                                         na.rm = na.rm, coef = coef))
      df.vertical <- df.vertical[, c("ymin", "lower", "middle", "upper", "ymax", "outliers")]
      df.vertical$orientation = "vertical"
    } else df.vertical <- data.frame()
    if(grepl("b|t", sides)){
      df.horizontal <- do.call(environment(StatBoxplot$compute_group)$f,
                               args = list(data = data[data$orientation == "horizontal", ], 
                                           scales = scales, width = width,
                                           na.rm = na.rm, coef = coef))
      df.horizontal <- df.horizontal[, c("ymin", "lower", "middle", "upper", "ymax", "outliers")]
      df.horizontal$orientation = "horizontal"
    } else df.horizontal <- data.frame()

    df <- rbind(df.vertical, df.horizontal)

    colnames(df) <- gsub("^y", "", colnames(df))
    df
  }
)

GeomMarginBoxplot <- ggproto(
  "GeomMarginBoxplot", Geom,

  setup_data = function(data, params, sides = "bl") {

    data.vertical <- data[data$orientation == "vertical", ]
    if(nrow(data.vertical) > 0) {
      colnames(data.vertical)[1:6] <- paste0("y", colnames(data.vertical)[1:6])
    } 
    data.horizontal <- data[data$orientation == "horizontal", ]
    if(nrow(data.horizontal) > 0){
      colnames(data.horizontal)[1:6] <- paste0("x", colnames(data.horizontal)[1:6])
    }
    data <- merge(data.vertical, data.horizontal, all = TRUE)
    data <- data[, sapply(data, function(x) !all(is.na(x)))]
    data
  },

  draw_group = function(data, panel_params, coord, fatten = 2,
                        outlier.shape = 19, outlier.stroke = 0.5,
                        outlier.size = 1.5, width = 0.9,
                        notch = FALSE, notchwidth = 0.5, varwidth = FALSE,
                        sides = "bl") {

    draw.marginal.box <- function(sides){

      if(sides %in% c("l", "b")){
        pos1 <- unit(0, "npc"); pos2 <- unit(0.03, "npc")
      } else {
        pos2 <- unit(0.97, "npc"); pos1 <- unit(1, "npc")
      }
      if(width > 0 & width < 1){
        increment <- (1 - width) / 2
        increment <- increment * (pos2 - pos1)
        pos1 <- pos1 + increment
        pos2 <- pos2 - increment
      }
      pos3 <- 0.5 * pos1 + 0.5 * pos2

      outliers_grob <- NULL

      if(sides %in% c("l", "r")) {
        data <- data[data$orientation == "vertical", ]

        if (!is.null(data$youtliers) && length(data$youtliers[[1]] >= 1)) {

          outliers <- data.frame(
            y = unlist(data$youtliers[[1]]),
            x = 0,
            colour = data$colour[1],
            fill = data$fill[1],
            shape = outlier.shape %||% data$shape[1],
            size = outlier.size %||% data$size[1],
            stroke = outlier.stroke %||% data$stroke[1],
            alpha = data$alpha[1],
            stringsAsFactors = FALSE
          )

          coords <- coord$transform(outliers, panel_params)

          x.pos <- rep(pos3, nrow(coords))
          y.pos <- unit(coords$y, "native")

          outliers_grob <- pointsGrob(
            x = x.pos, y = y.pos,
            pch = coords$shape,
            gp = gpar(col = coords$colour, 
                      fill = alpha(coords$fill, coords$alpha), 
                      fontsize = coords$size * .pt + coords$stroke * .stroke/2, 
                      lwd = coords$stroke * .stroke/2))
        }

        box.whiskers <- data.frame(
          y = c(data$ymin, data$ylower, data$ymiddle, data$yupper, data$ymax),
          x = 0,
          colour = data$colour[1],
          fill = data$fill[1],
          size = data$size[1],
          alpha = data$alpha[1],
          stringsAsFactors = FALSE
        )

        box.whiskers <- coord$transform(box.whiskers, panel_params)

        whiskers_grob <- segmentsGrob(
          x0 = rep(pos3, 2),
          x1 = rep(pos3, 2),
          y0 = unit(c(box.whiskers$y[1], box.whiskers$y[5]), "native"),
          y1 = unit(c(box.whiskers$y[2], box.whiskers$y[4]), "native"),
          gp = gpar(col = box.whiskers$colour,
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))

        box_grob <- rectGrob(
          x = pos1,
          y = unit(box.whiskers$y[4], "native"),
          width = pos2 - pos1,
          height = unit(box.whiskers$y[4] - box.whiskers$y[2], "native"),
          just = c("left", "top"),
          gp = gpar(col = box.whiskers$colour,
                    fill = alpha(box.whiskers$fill, box.whiskers$alpha),
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))

        median_grob <- segmentsGrob(
          x0 = rep(pos1, 2),
          x1 = rep(pos2, 2),
          y0 = unit(box.whiskers$y[3], "native"),
          y1 = unit(box.whiskers$y[3], "native"),
          gp = gpar(col = box.whiskers$colour,
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))
      } 

      if(sides %in% c("b", "t")) {
        data <- data[data$orientation == "horizontal", ]

        if (!is.null(data$xoutliers) && length(data$xoutliers[[1]] >= 1)) {

          outliers <- data.frame(
            x = unlist(data$xoutliers[[1]]),
            y = 0,
            colour = data$colour[1],
            fill = data$fill[1],
            shape = outlier.shape %||% data$shape[1],
            size = outlier.size %||% data$size[1],
            stroke = outlier.stroke %||% data$stroke[1],
            alpha = data$alpha[1],
            stringsAsFactors = FALSE
          )

          coords <- coord$transform(outliers, panel_params)

          x.pos <- unit(coords$x, "native")
          y.pos <- rep(pos3, nrow(coords))

          outliers_grob <- pointsGrob(
            x = x.pos, y = y.pos,
            pch = coords$shape,
            gp = gpar(col = coords$colour, 
                      fill = alpha(coords$fill, coords$alpha), 
                      fontsize = coords$size * .pt + coords$stroke * .stroke/2, 
                      lwd = coords$stroke * .stroke/2))
        }

        box.whiskers <- data.frame(
          x = c(data$xmin, data$xlower, data$xmiddle, data$xupper, data$xmax),
          y = 0,
          colour = data$colour[1],
          fill = data$fill[1],
          size = data$size[1],
          alpha = data$alpha[1],
          stringsAsFactors = FALSE
        )

        box.whiskers <- coord$transform(box.whiskers, panel_params)

        whiskers_grob <- segmentsGrob(
          y0 = rep(pos3, 2),
          y1 = rep(pos3, 2),
          x0 = unit(c(box.whiskers$x[1], box.whiskers$x[5]), "native"),
          x1 = unit(c(box.whiskers$x[2], box.whiskers$x[4]), "native"),
          gp = gpar(col = box.whiskers$colour,
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))

        box_grob <- rectGrob(
          y = pos2,
          x = unit(box.whiskers$x[2], "native"),
          height = pos2 - pos1,
          width = unit(box.whiskers$x[4] - box.whiskers$x[2], "native"),
          just = c("left", "top"),
          gp = gpar(col = box.whiskers$colour,
                    fill = alpha(box.whiskers$fill, box.whiskers$alpha),
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))

        median_grob <- segmentsGrob(
          y0 = rep(pos1, 2),
          y1 = rep(pos2, 2),
          x0 = unit(box.whiskers$x[3], "native"),
          x1 = unit(box.whiskers$x[3], "native"),
          gp = gpar(col = box.whiskers$colour,
                    lwd = box.whiskers$size * .pt,
                    lty = box.whiskers$linetype))
      }

      grobTree(outliers_grob,
               whiskers_grob,
               box_grob,
               median_grob)
    }

    result <- list()

    if(grepl("l", sides)) result$l <- draw.marginal.box("l")
    if(grepl("r", sides)) result$r <- draw.marginal.box("r")
    if(grepl("b", sides)) result$b <- draw.marginal.box("b")
    if(grepl("t", sides)) result$t <- draw.marginal.box("t")

    gTree(children = do.call("gList", result))

  },

  draw_key = draw_key_boxplot,

  default_aes = aes(weight = 1, colour = "grey20", fill = "white", 
                    size = 0.5, stroke = 0.5,
                    alpha = 0.75, shape = 16, linetype = "solid",
                    sides = "bl"),

  optional_aes = c("lower", "upper", "middle", "min", "max")
)

会话信息:R 3.5.1,ggplot2 3.0.0.

Session info: R 3.5.1, ggplot2 3.0.0.

这篇关于在ggplot2中为每个组添加geom_rug之类的箱线图的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆