Adding columns with metadata to geom_tile ggplot

Question

I have the following data:

id <- 1:80
gyrA <- sample(c(1,0), 80, replace = TRUE)
parC <- sample(c(1,0), 80, replace = TRUE)
marR <- sample(c(1,0), 80, replace = TRUE)
qnrS <- sample(c(1,0), 80, replace = TRUE)
marA <- sample(c(1,0), 80, replace = TRUE)
ydhE <- sample(c(1,0), 80, replace = TRUE)
qnrA <- sample(c(1,0), 80, replace = TRUE)
qnrB <- sample(c(1,0), 80, replace = TRUE)
qnrD <- sample(c(1,0), 80, replace = TRUE)
mcbE <- sample(c(1,0), 80, replace = TRUE)
oqxAB <- sample(c(1,0), 80, replace = TRUE)
species <- sample(c("Wild bird","Pig","Red Fox","Broiler"), 80, replace = TRUE)

test_data <- data.frame(id,species,gyrA,parC,marR,marA,qnrS,qnrA,qnrB,qnrD,ydhE,mcbE,oqxAB)


library(dplyr)
library(tidyr)    # gather() comes from tidyr, not dplyr
library(ggplot2)

plot_data <- test_data %>%
  gather(key = "gene", value = "value", -c(id, species)) %>%  # gather only the gene columns; species stays a separate column
  mutate(id = factor(id, levels = unique(id)),
         gene = factor(gene, levels = unique(gene)))

I want to create a heatmap with presence/absence of the genes in the data. However, I also want a column with the species in the same plot. I gathered all the presence/absence columns (gyrA, parC etc.) into one column.

I have managed to create the heatmap, but without the species included. Preferably, I would also like to be able to add columns for any other data related to these samples that I might get later on.

Plot:

ggplot(plot_data, aes(gene, id, fill = value))+
  geom_tile(color = "black")+
  theme_classic()

How do I add a column with species to the plot, so that it looks like this?

Is there any simple way to do this? If easier, is it possible to at least create a column with text that says which species is represented at each row?

Recommended answer

Edit

Based on his/her comment, I have adapted the sample data to reflect the actual question of the OP.

colors <- c("#b13da1", "#00b551" , "#fff723" , "#ff0022")

plot_data$label <- paste("1 -", as.character(plot_data$species))
plot_data$label[plot_data$value==0] <- "0"

ggplot(plot_data, aes(gene, id, fill = label))+
  geom_tile(color = "black")+
  theme_classic()+
  scale_fill_manual(values = c("white", colors), name = "Value")+  # name the legend title explicitly
  theme(
    axis.line = element_blank(),
    axis.ticks = element_blank()) +
  xlab("Gene") + ylab("id")

With the rows grouped by species for readability:

library(forcats)

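# species is a character column in R >= 4.0, so convert it to a factor before taking numeric codes
ggplot(plot_data, aes(gene, fct_reorder(id, as.numeric(factor(species))), fill = label))+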
  geom_tile(color = "black")+
  theme_classic()+
  scale_fill_manual(values = c("white", colors), name = "Value")+  # name the legend title explicitly
  theme(
    axis.line = element_blank(),
    axis.ticks = element_blank()) +
  xlab("Gene") + ylab("id")
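
Another way to group the rows by species, not part of the original answer, is to facet on species so that each group gets its own strip label. The sketch below is only an illustration under stated assumptions: the data frame demo, the reduced gene list, and the colors are made up for the example, and pivot_longer() is used in place of gather().

```r
library(tidyr)
library(ggplot2)

set.seed(1)  # illustrative data only, mirroring the structure used in the question
demo <- data.frame(
  id = factor(1:10),
  species = sample(c("Wild bird", "Pig", "Red Fox", "Broiler"), 10, replace = TRUE)
)
for (g in c("gyrA", "parC", "marR", "qnrS")) {
  demo[[g]] <- sample(c(1, 0), 10, replace = TRUE)
}

# long format: one row per sample/gene pair, species kept as its own column
demo_long <- pivot_longer(demo, cols = -c(id, species),
                          names_to = "gene", values_to = "value")
demo_long$value <- factor(demo_long$value, levels = c(1, 0))

p_facet <- ggplot(demo_long, aes(gene, id, fill = value)) +
  geom_tile(color = "black") +
  # one panel per species; free y scale and space keep all tiles the same height
  facet_grid(species ~ ., scales = "free_y", space = "free_y") +
  scale_fill_manual(values = c("1" = "#403f3f", "0" = "white"), name = "Value") +
  theme_classic() +
  theme(strip.text.y = element_text(angle = 0)) +  # horizontal species labels
  xlab("Gene") + ylab("id")

p_facet
```

The strip on the right then names the species for each block of rows, so the fill scale only has to encode presence and absence.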

Something a bit closer to what the OP would like using some workarounds (but I think the resulting figure is less clear than the first one).

newdata <- plot_data[1:10,]
newdata$gene <- "Species"
newdata$value <- newdata$species
plot_data <- rbind(plot_data, newdata)

# "" and " " are artificial levels that pad the legend so that, with four species,
# the 0/1 values and the species end up in two separate legend columns
species_levels <- sort(unique(as.character(newdata$species)))
plot_data$value <- factor(as.character(plot_data$value),
                          levels = c("1", "0", "", " ", species_levels))
plot_data$gene <- relevel(plot_data$gene, ref = "Species")  # put the Species column first

colors <- c("#b13da1", "#00b551" , "#fff723" , "#ff0022")

ggplot(plot_data, aes(gene, id, fill = value))+
  geom_tile()+
  geom_tile(color = "black", show.legend = FALSE)+
  theme_classic()+
  scale_fill_manual(values = c("#403f3f", "grey", "white", "white", colors),
                    name = "Value Species", drop = FALSE)+
  theme(
    axis.line = element_blank(),
    axis.ticks = element_blank()) +
  guides(fill = guide_legend(ncol=2)) +
  xlab("Gene") + ylab("id")+
  scale_x_discrete(position = "top") 
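
For the simpler fallback mentioned in the question (a column of text naming the species of each row), one possible sketch is to draw the labels with geom_text() at an extra x position. This is an assumption-laden illustration rather than the answerer's method: the data frame demo, the position name "Species", and the colors are invented for the example.

```r
library(tidyr)
library(ggplot2)

set.seed(2)  # illustrative data only
demo <- data.frame(
  id = factor(1:10),
  species = sample(c("Wild bird", "Pig", "Red Fox", "Broiler"), 10, replace = TRUE),
  gyrA = sample(c(1, 0), 10, replace = TRUE),
  parC = sample(c(1, 0), 10, replace = TRUE),
  marR = sample(c(1, 0), 10, replace = TRUE)
)

demo_long <- pivot_longer(demo, cols = -c(id, species),
                          names_to = "gene", values_to = "value")
demo_long$value <- factor(demo_long$value, levels = c(1, 0))

# x positions: the metadata column first, then the genes
x_order <- c("Species", unique(demo_long$gene))

p_text <- ggplot(demo_long, aes(gene, id)) +
  geom_tile(aes(fill = value), color = "black") +
  # one text label per sample, drawn at the extra x position "Species"
  geom_text(data = demo, aes(x = "Species", y = id, label = species), size = 3) +
  scale_x_discrete(limits = x_order, position = "top") +
  scale_fill_manual(values = c("1" = "#403f3f", "0" = "white"), name = "Value") +
  theme_classic() +
  theme(axis.line = element_blank(), axis.ticks = element_blank()) +
  xlab(NULL) + ylab("id")

p_text
```

Because the text layer reads from the wide table, further metadata columns could be added the same way by appending more names to x_order and adding one geom_text() layer per column.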

Sample data (run this before the plotting code above)

test_data <- test_data[1:10,]

library(dplyr)
library(tidyr)    # gather() comes from tidyr, not dplyr
library(ggplot2)

plot_data <- test_data %>%
  gather(key = "gene", value = "value", -c(id, species)) %>%
  mutate(id = factor(id, levels = 1:10),
         gene = factor(gene, levels = unique(gene)),
         value = factor(value, levels = c(1,0)))
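
gather() still works but is superseded in current tidyr releases. A rough equivalent of the reshaping step with pivot_longer() might look like the sketch below; the data frame wide is a made-up stand-in for test_data with only two gene columns.

```r
library(dplyr)
library(tidyr)

set.seed(3)  # illustrative data only
wide <- data.frame(
  id = 1:10,
  species = sample(c("Wild bird", "Pig", "Red Fox", "Broiler"), 10, replace = TRUE),
  gyrA = sample(c(1, 0), 10, replace = TRUE),
  parC = sample(c(1, 0), 10, replace = TRUE)
)

long <- wide %>%
  # every column except id and species is treated as a gene column
  pivot_longer(cols = -c(id, species), names_to = "gene", values_to = "value") %>%
  mutate(id = factor(id, levels = 1:10),
         gene = factor(gene, levels = unique(gene)),
         value = factor(value, levels = c(1, 0)))

str(long)
```

Note that pivot_longer() orders the rows by sample rather than by gene, so code that relies on row positions, such as plot_data[1:10,], would need adjusting if this variant were used.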
