ggplot2: More complex faceting


Problem description

I have a heatmap that continues to become more and more complex. An example of the melted data:

head(df2)
  Class     Subclass         Family               variable value
1     A chemosensory family_1005117 caenorhabditis_elegans    10
2     A chemosensory family_1011230 caenorhabditis_elegans     4
3     A chemosensory family_1022539 caenorhabditis_elegans    10
4     A        other family_1025293 caenorhabditis_elegans    NA
5     A chemosensory family_1031345 caenorhabditis_elegans    10
6     A chemosensory family_1033309 caenorhabditis_elegans    10
tail(df2)
     Class Subclass        Family        variable value
6496     C  class c family_455391 trichuris_muris     1
6497     C  class c family_812893 trichuris_muris    NA
6498     F  class f family_225491 trichuris_muris     1
6499     F  class f family_236822 trichuris_muris     1
6500     F  class f family_276074 trichuris_muris     1
6501     F  class f family_768194 trichuris_muris    NA

Using ggplot2 and geom_tile, I was able to produce a beautiful heatmap of the data. I am proud of the code (this is my first experience in R), so have posted it below:

df2[df2 == 0] <- NA
df2[df2 > 10] <- 10
df2.t <- data.table(df2)
df2.t[, clade := ifelse(variable %in% c("pristionchus_pacificus", "caenorhabditis_elegans", "ancylostoma_ceylanicum", "necator_americanus", "nippostrongylus_brasiliensis", "angiostrongylus_costaricensis", "dictyocaulus_viviparus", "haemonchus_contortus"), "Clade V",
                 ifelse(variable %in% c("meloidogyne_hapla","panagrellus_redivivus", "rhabditophanes_kr3021", "strongyloides_ratti"), "Clade IV",
                 ifelse(variable %in% c("toxocara_canis", "dracunculus_medinensis", "loa_loa", "onchocerca_volvulus", "ascaris_suum", "brugia_malayi", "litomosoides_sigmodontis", "syphacia_muris", "thelazia_callipaeda"), "Clade III",
                 ifelse(variable %in% c("romanomermis_culicivorax", "trichinella_spiralis", "trichuris_muris"), "Clade I",
                 ifelse(variable %in% c("echinococcus_multilocularis", "hymenolepis_microstoma", "mesocestoides_corti", "taenia_solium", "schistocephalus_solidus"), "Cestoda",
                 ifelse(variable %in% c("clonorchis_sinensis", "fasciola_hepatica", "schistosoma_japonicum", "schistosoma_mansoni"), "Trematoda", NA))))))]
df2.t$clade <- factor(df2.t$clade, levels = c("Clade I", "Clade III", "Clade IV", "Clade V", "Cestoda", "Trematoda"))
plot2 <- ggplot(df2.t, aes(variable, Family))
tile2 <- plot2 + geom_tile(aes(fill = value)) + facet_grid(Class ~ clade, scales = "free", space = "free")
tile2 <- tile2 + scale_x_discrete(expand = c(0,0)) + scale_y_discrete(expand = c(0,0))
tile2 <- tile2 + theme(axis.text.y = element_blank(), axis.ticks.y = element_blank(), legend.position = "right", axis.text.x = element_text(angle = 90, hjust = 1, vjust = 0.55), panel.border = element_rect(fill = NA, color = "grey", size = 0.5, linetype = "solid"))
tile2 <- tile2 + xlab(NULL)
tile2 <- tile2 + scale_fill_gradientn(breaks = c(1,2,3,4,5,6,7,8,9,10), labels = c("1", "2", "3", "4", "5", "6", "7", "8", "9", ">10"), limits = c(1, 10), colours = palette(11), na.value = "white", name = "Members")
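
The nested ifelse() chain above can also be written as a lookup. The following is only a minimal sketch in base R, not the code used for the plot: `clade_of` and `species` are hypothetical names, and only a handful of the species from the code above are listed to keep it short. Species missing from the lookup come out as NA, which matches the final NA branch of the ifelse() chain.

```r
# Hypothetical lookup from species name to clade.
# Only a few of the species from the code above are listed here.
clade_of <- c(
  caenorhabditis_elegans = "Clade V",
  haemonchus_contortus   = "Clade V",
  strongyloides_ratti    = "Clade IV",
  brugia_malayi          = "Clade III",
  trichuris_muris        = "Clade I",
  taenia_solium          = "Cestoda",
  schistosoma_mansoni    = "Trematoda"
)

# Example input; in the question this would be df2$variable.
species <- c("trichuris_muris", "caenorhabditis_elegans", "unknown_species")

# Indexing by name returns NA for species that are not in the lookup.
clade <- factor(
  unname(clade_of[species]),
  levels = c("Clade I", "Clade III", "Clade IV", "Clade V", "Cestoda", "Trematoda")
)
```

With a lookup like this, adding a species means adding one entry rather than editing a chain of conditions.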

As you can see, there is quite a bit of manual reordering involved; otherwise the code is pretty simple. Here's the image output:

However, you may notice that a whole column of information, "Subclass", is not used. Each Subclass fits within exactly one Class. It would be perfect if I were able to facet these Subclasses within the Class facet already displayed. As far as I know, this is impossible. To be precise, only Class A has varying Subclasses. The other Classes simply have their class name mirrored (F = class f). Is there another way to organize this heatmap so that I can display all of the relevant information? The missing Subclasses contain some of the most crucial data and will be the most necessary for drawing inferences from the data.

An alternative approach would be to facet the Subclasses instead of the Classes, manually reorder them so that the Classes are clustered together, and then draw some sort of box around them to demarcate each Class. I have no idea how this would be done.
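
One way the reordering idea could be sketched is shown below. It is an illustration under assumptions, not a tested solution for the real data: `toy` is a made-up data frame with the same columns as df2, and `ord` and `p_nested` are hypothetical names. It relies on facet_grid() accepting several row variables joined with `+`, which draws one strip per variable, so Class and Subclass both appear next to each panel row.

```r
library(ggplot2)

# Made-up data with the same columns as df2 (values are illustrative only).
toy <- data.frame(
  Class    = c("A", "A", "A", "C", "F"),
  Subclass = c("chemosensory", "other", "chemosensory", "class c", "class f"),
  Family   = paste0("family_", 1:5),
  variable = c("caenorhabditis_elegans", "trichuris_muris", "trichuris_muris",
               "caenorhabditis_elegans", "trichuris_muris"),
  clade    = c("Clade V", "Clade I", "Clade I", "Clade V", "Clade I"),
  value    = c(10, 4, NA, 1, 1),
  stringsAsFactors = FALSE
)

# Order the Subclass levels by Class so subclasses of one Class sit together.
# This assumes every Subclass name belongs to exactly one Class.
ord <- unique(toy[order(toy$Class, toy$Subclass), c("Class", "Subclass")])
toy$Subclass <- factor(toy$Subclass, levels = ord$Subclass)

# Two row variables give two strips per panel row: Class, then Subclass.
p_nested <- ggplot(toy, aes(variable, Family)) +
  geom_tile(aes(fill = value)) +
  facet_grid(Class + Subclass ~ clade, scales = "free", space = "free")
```

This does not draw a box around each Class, but the repeated Class strip marks which Subclass panels belong together.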

Any help would be very useful. Please let me know if you need any additional information.

Solution

This will put a new strip to the right of the original strip, and to the left of the legend.

library(ggplot2)
library(gtable)
library(grid)

p <- ggplot(mtcars, aes(mpg, wt, colour = factor(vs))) + geom_point()
p <- p + facet_grid(cyl ~ gear)

# Convert the plot to a grob
gt <- ggplotGrob(p)

# Get the positions of the right strips in the layout: t = top, l = left, ...
strip <- c(subset(gt$layout, grepl("strip-r", gt$layout$name), select = t:r))

# New column to the right of current strip
gt <- gtable_add_cols(gt, gt$widths[9], max(strip$r))

# Add grob, the new strip, into new column
gt <- gtable_add_grob(gt, 
  list(rectGrob(gp = gpar(col = NA, fill = "grey85", size = .5)),
  textGrob("Number of Cylinders", rot = -90, vjust = .27, 
        gp = gpar(cex = .75, fontface = "bold", col = "black"))), 
        t = min(strip$t), l = max(strip$r) + 1, b = max(strip$b), name = c("a", "b"))

# Add small gap between strips
gt <- gtable_add_cols(gt, unit(1/5, "line"), max(strip$r))

# Draw it
grid.newpage()
grid.draw(gt)
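
The steps above can be wrapped in a small helper so the same approach can be tried on other facet_grid() plots. This is a sketch under assumptions rather than part of the original answer: `add_right_strip` and the grob names are hypothetical, and the new column takes the width of the existing right-hand strip column instead of the hard-coded `gt$widths[9]`.

```r
library(ggplot2)
library(gtable)
library(grid)

# Hypothetical helper: add one outer strip spanning all right-hand strips.
add_right_strip <- function(p, label) {
  gt <- ggplotGrob(p)

  # Layout rows that describe the existing right-hand strips
  strip <- gt$layout[grepl("strip-r", gt$layout$name), ]
  r <- max(strip$r)

  # New column to the right of the current strips.
  # Assumption: reuse the width of the existing strip column.
  gt <- gtable_add_cols(gt, gt$widths[r], pos = r)

  # Background rectangle and rotated label, spanning all panel rows
  gt <- gtable_add_grob(
    gt,
    list(rectGrob(gp = gpar(col = NA, fill = "grey85")),
         textGrob(label, rot = -90,
                  gp = gpar(cex = .75, fontface = "bold", col = "black"))),
    t = min(strip$t), l = r + 1, b = max(strip$b),
    name = c("outer-strip-bg", "outer-strip-text")
  )

  # Small gap between the original strips and the new one
  gtable_add_cols(gt, unit(1/5, "line"), pos = r)
}

p <- ggplot(mtcars, aes(mpg, wt, colour = factor(vs))) +
  geom_point() +
  facet_grid(cyl ~ gear)

gt_new <- add_right_strip(p, "Number of Cylinders")
grid.newpage()
grid.draw(gt_new)
```

The helper adds a single label across all rows. Labelling groups of rows separately (for example one outer strip per Class) would need one grob per group, each with its own `t` and `b`.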
