Combine stack and dodge with bar plot in ggplot2


Problem description

I'm trying to recreate this plot without the horrible 3d bar plot and the unclear x axis (these are distinct timepoints and it's hard to tell when they are).

(Taken from Science 291, no. 5513 (2001): 2606–8, otherwise a nice paper.)

My first instinct is to do something similar to what they did, with a 2d bar plot and distinct x axis labels, using dodged bars for the genotype and then stacked bars to get the black and white split on the front bar, but several other good questions here say you can't do that.

My next approach was to use faceting (code below), which worked reasonably well, but I'd love to see a better way to do this. Is there a way to stack some variables and dodge others? Or is there just a better way to do this in general?

To clarify, I think that it is important to show the total of the stacked bars (m and n in this case, black and white originally), because this represents a measured quantity, and the split is then a separate measurement.

library(tidyverse)
library(cowplot)

data = tribble(
  ~Timepoint, ~`Ancestral genotype`, ~Mutator, ~`Mean % of auxotrophs`,
  100, 'mutS-', 'o', 10.5,
  150, 'mutS-', 'o', 16,
  220, 'mutS-', 'o', NA,
  300, 'mutS-', 'o', 24.5,
  100, 'mutS+', 'n', 1,
  150, 'mutS+', 'n', NA,
  220, 'mutS+', 'n', 1,
  300, 'mutS+', 'n', 1,
  100, 'mutS+', 'm', 0,
  150, 'mutS+', 'm', NA,
  220, 'mutS+', 'm', 2,
  300, 'mutS+', 'm', 5
)

data <- data %>% mutate(Timepoint = as.character(Timepoint))

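# Stack the mutator classes within each bar and facet by ancestral genotype
data %>% ggplot(aes(x = Timepoint, y = `Mean % of auxotrophs`)) +
  geom_col(aes(fill = Mutator), position = 'stack') +
  facet_grid(~`Ancestral genotype`) +
  guides(fill = "none")  # fill = FALSE is deprecated in current ggplot2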

Recommended answer

It seems to me that a line plot is more intuitive here:

 library(forcats)

 # Timepoint must still be numeric here (skip the as.character() step)
 data %>% 
   filter(!is.na(`Mean % of auxotrophs`)) %>%
   ggplot(aes(x = Timepoint, y = `Mean % of auxotrophs`, 
              colour = fct_relevel(Mutator, c("o","m","n")), linetype=`Ancestral genotype`)) +
   geom_line() +
   geom_point(size=4) + 
   labs(linetype="Ancestral\ngenotype", colour="Mutator")

To respond to your comment: here's a hacky way to stack separately by Ancestral genotype and then dodge each pair. We plot stacked bars separately for mutS- and mutS+, and dodge the bars manually by shifting Timepoint a small amount in opposite directions. Setting the bar width equal to twice the shift amount would make the two bars in each pair touch. I've added a small amount of extra shift (5.5 instead of 5) to create a tiny gap between the two bars in each pair.

 # linewidth replaces size for bar outlines in ggplot2 >= 3.4.0
 ggplot() +
   # mutS+ bars, shifted right of the true timepoint
   geom_col(data=data %>% filter(`Ancestral genotype`=="mutS+"),
            aes(x = Timepoint + 5.5, y = `Mean % of auxotrophs`, fill=Mutator),
            width=10, colour="grey40", linewidth=0.4) + 
   # mutS- bars, shifted left of the true timepoint
   geom_col(data=data %>% filter(`Ancestral genotype`=="mutS-"),
            aes(x = Timepoint - 5.5, y = `Mean % of auxotrophs`, fill=Mutator), 
            width=10, colour="grey40", linewidth=0.4) + 
   scale_fill_discrete(drop=FALSE) +
   scale_y_continuous(limits=c(0,26), expand=c(0,0)) +
   labs(x="Timepoint")
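
The same manual dodge could probably also be written with a single geom_col() call by computing the shifted x position in the data first. The following is only a minimal sketch under that assumption: `shift`, `shifted` and `x_shifted` are illustrative names that do not appear in the original answer, and the axis breaks are set to the measured timepoints so the labels stay readable.

```r
library(dplyr)
library(ggplot2)

# Same values as in the question; Timepoint is kept numeric.
data <- tibble::tribble(
  ~Timepoint, ~`Ancestral genotype`, ~Mutator, ~`Mean % of auxotrophs`,
  100, 'mutS-', 'o', 10.5,
  150, 'mutS-', 'o', 16,
  220, 'mutS-', 'o', NA,
  300, 'mutS-', 'o', 24.5,
  100, 'mutS+', 'n', 1,
  150, 'mutS+', 'n', NA,
  220, 'mutS+', 'n', 1,
  300, 'mutS+', 'n', 1,
  100, 'mutS+', 'm', 0,
  150, 'mutS+', 'm', NA,
  220, 'mutS+', 'm', 2,
  300, 'mutS+', 'm', 5
)

shift <- 5.5  # same offset as in the answer (illustrative variable name)

# x_shifted is a hypothetical helper column: mutS+ moves right, mutS- moves left.
shifted <- data %>%
  filter(!is.na(`Mean % of auxotrophs`)) %>%
  mutate(x_shifted = Timepoint +
           if_else(`Ancestral genotype` == "mutS+", shift, -shift))

# geom_col() stacks by default, so rows sharing an x_shifted value
# (the m and n rows of mutS+) end up in one stacked bar.
p <- ggplot(shifted, aes(x = x_shifted, y = `Mean % of auxotrophs`, fill = Mutator)) +
  geom_col(width = 10, colour = "grey40") +
  scale_x_continuous(breaks = sort(unique(data$Timepoint))) +
  labs(x = "Timepoint")
p
```

Keeping the offset in one variable means the bar gap can be tuned in a single place, at the cost of an extra column in the data.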

Note: In both of the examples above, I've kept Timepoint as a numeric variable (i.e., I skipped the step where you converted it to character) in order to ensure that the x-axis is denominated in time units, rather than converting it to a categorical axis. The 3D plot is an abomination, not only because of distortion due to the 3D perspective, but also because it creates a false appearance that each measurement is separated by the same time interval.
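
As a quick check of the point about unequal spacing, the gaps between consecutive timepoints can be computed directly. This small sketch uses only the Timepoint values from the question's data; the variable name `timepoints` is illustrative.

```r
# Timepoint values from the question's data
timepoints <- c(100, 150, 220, 300)

# Differences between consecutive timepoints: gaps of 50, 70 and 80
diff(timepoints)
```

A categorical axis would draw these three gaps at the same width, which is the false impression described above.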
