How do you make a heat map and cluster with NA values?


Problem description

I am trying to make a heat map from my data but am struggling to code it properly.
My matrix is filled with log(x+1) values so that I don't run into log(0) errors. However, due to the nature of my data I have a lot of 0 values, and they mask any trends the heat map could be showing. Because of that, I want to colour any 0 values grey or black and colour the rest of my data along a blue-white-red spectrum.

Here is the code I am using:

library(gplots)  # provides heatmap.2

RHeatmap <- read.delim("~/Desktop/RHeatmap.txt", row.names = 1, stringsAsFactors = FALSE)

my_palette <- colorRampPalette(c("blue", "white", "red"))(n = 20)
RHeatmap.matrix <- as.matrix(RHeatmap)
RHeatmap.matrix[RHeatmap.matrix == 0] <- NA  # mask the zeros

heatmap.2(RHeatmap.matrix, trace = "none", col = my_palette, margins = c(5, 1), scale = "none", symbreaks = FALSE, Colv = TRUE, dendrogram = "both", lwid = c(1.5, 2.0))

When looking online for how to assign the 0 values a separate colour, I noticed that people set them to NA, which can then be given its own colour. Question 1: How would I do that?

I was also wondering how to cluster with NA values. When I tried, I received an error saying you can't cluster with NA values.

Recommended answer

To get this to work you need to specify the breaks. Note: there needs to be one more break than colors.

library(gplots)

dat <- matrix(2^rnorm(900, sd = 5), ncol = 9)
dat[sample(seq_along(dat), size = 180)] <- 0  # set some entries to 0

dat2 <- log2(dat + 1)
dat2[dat2 == 0] <- NA  # zeros become NA so they are drawn in na.color

# 20 colors need 21 breaks; compute them only after dat2 exists
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(n = 20)
breaks <- seq(min(dat2, na.rm = TRUE), max(dat2, na.rm = TRUE), length.out = 21)

heatmap.2(dat2, trace = "none", na.color = "black", scale = "none",
          col = my_palette, breaks = breaks)
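
For the second part of the question (clustering failing once NAs are present), one possible workaround, offered as a sketch and not as part of the original answer, is to build the dendrograms from the unmasked matrix (zeros kept) and hand them to heatmap.2 through Rowv and Colv, while plotting the NA-masked copy. This avoids the case where two rows share no non-missing columns, for which dist() returns NA and hclust() stops with an error. The simulated data and the names full, masked, row_dend and col_dend are illustrative assumptions; the gplots call is guarded so the clustering part runs even if the package is not installed.

```r
set.seed(1)

# Simulated stand-in for the real data: log2(x + 1) values with some exact zeros
raw <- matrix(2^rnorm(900, sd = 5), ncol = 9)
raw[sample(seq_along(raw), size = 180)] <- 0
full <- log2(raw + 1)

# Copy used only for display: zeros become NA so they are drawn in na.color
masked <- full
masked[masked == 0] <- NA

# Cluster on the unmasked matrix, where every pairwise distance is defined
row_dend <- as.dendrogram(hclust(dist(full)))
col_dend <- as.dendrogram(hclust(dist(t(full))))

# One more break than colors, spanning the non-missing values
my_palette <- colorRampPalette(c("yellow", "orange", "red"))(20)
breaks <- seq(min(masked, na.rm = TRUE), max(masked, na.rm = TRUE),
              length.out = length(my_palette) + 1)

# Plot the masked values, ordered by the dendrograms computed above
if (requireNamespace("gplots", quietly = TRUE)) {
  gplots::heatmap.2(masked, Rowv = row_dend, Colv = col_dend,
                    dendrogram = "both", trace = "none", scale = "none",
                    na.color = "black", col = my_palette, breaks = breaks)
}
```

Whether the zeros should take part in the clustering at all depends on the data; this sketch assumes they are meaningful measurements that should only be hidden from the color scale.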

My two cents on your more general visualization question:

1) All of your data is above 0, so I would recommend a sequential color map rather than a diverging one. White tends to be read as 0; in this case I see white and automatically think it is 0.

2) Your current heatmap looks good to me, i.e. well clustered and well represented (color map aside). I'm not sure how much "better" it could get or what "better" would look like.

3) If your data has 0s in it I would keep them, so long as they are meaningful. This is very data dependent.

4) You could look into different distance metrics that treat or weight 0 entries differently.

5) Setting 0s to NA will change the clustering because, by default, the distance between two rows is computed only from the columns where both values are present. See ?dist for more info.
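
Points 4 and 5 can be checked on a tiny matrix. The sketch below uses only base R's dist() and hclust(); the matrix m is made up for illustration. It shows that masking zeros as NA changes the Euclidean distance between two rows (the sum is rescaled to the number of columns actually used), that a pair of rows with no shared non-missing columns gets an NA distance, which hclust() rejects, and that method = "binary" is one metric that treats zeros as "off" rather than as values.

```r
# Toy matrix, made up for illustration: rows are the items being clustered
m <- rbind(a = c(1, 2, 0, 0),
           b = c(2, 4, 0, 0),
           c = c(0, 0, 3, 5))

# Zeros kept: ordinary Euclidean distances over all four columns
d_zero <- dist(m)

# Zeros masked as NA: each pair uses only columns where both rows are observed
m_na <- m
m_na[m_na == 0] <- NA
d_na <- dist(m_na)

print(as.matrix(d_zero))  # a-b is sqrt(5)
print(as.matrix(d_na))    # a-b is sqrt(10); a-c and b-c are NA

# An NA in the distance object makes hierarchical clustering fail
clustering_fails <- inherits(try(hclust(d_na), silent = TRUE), "try-error")
print(clustering_fails)

# Presence/absence metric: non-zero is "on", zero is "off"
d_bin <- dist(m, method = "binary")
print(as.matrix(d_bin))   # a-b is 0 (same pattern), a-c is 1 (no overlap)
```

heatmap.2 takes its distance function through the distfun argument, so a metric like this could be tried with something along the lines of distfun = function(x) dist(x, method = "binary"); whether a presence/absence metric suits the data is a judgment call.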
