Colorize Clusters in Dendrogram with ggplot2
Problem description
Didzis Elferts showed how to plot a dendrogram using ggplot2 and ggdendro in the answer to "horizontal dendrogram in R with labels".
Here is the code:
    labs = paste("sta_",1:50,sep="") #new labels
    rownames(USArrests)<-labs #set new row names
    hc <- hclust(dist(USArrests), "ave")

    library(ggplot2)
    library(ggdendro)

    #convert cluster object to use with ggplot
    dendr <- dendro_data(hc, type="rectangle")

    #your own labels are supplied in geom_text() and label=label
    ggplot() +
      geom_segment(data=segment(dendr), aes(x=x, y=y, xend=xend, yend=yend)) +
      geom_text(data=label(dendr), aes(x=x, y=y, label=label, hjust=0), size=3) +
      coord_flip() + scale_y_reverse(expand=c(0.2, 0)) +
      theme(axis.line.y=element_blank(),
            axis.ticks.y=element_blank(),
            axis.text.y=element_blank(),
            axis.title.y=element_blank(),
            panel.background=element_rect(fill="white"),
            panel.grid=element_blank())
Does anyone know how to colorize the different clusters? For example, how would you colorize 2 clusters (k=2)?
Solution
A workaround is to plot the cluster object with plot() and then use the function rect.hclust() to draw borders around the clusters (the number of clusters is set with the argument k=). If the result of rect.hclust() is saved as an object, it is a list in which each element contains the observations belonging to one cluster.

    plot(hc)
    gg <- rect.hclust(hc, k=2)
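If the rectangles themselves are not needed, a similar grouping can probably be obtained without drawing anything. The sketch below swaps in base R's cutree() for rect.hclust(); it is an alternative, not the approach of the answer, and the names arrests, memb and clust.cut are made up for illustration. cutree() assigns its own cluster numbers, so the Clust1/Clust2 names are not guaranteed to line up with the list order returned by rect.hclust().

```r
# Work on a copy so the built-in USArrests data set is left untouched
arrests <- USArrests
rownames(arrests) <- paste("sta_", 1:50, sep = "")

hc <- hclust(dist(arrests), "average")

# Named integer vector: one cluster number per observation
memb <- cutree(hc, k = 2)

# One row per label with its cluster name (hypothetical column layout)
clust.cut <- data.frame(label = names(memb),
                        clust = paste0("Clust", memb),
                        stringsAsFactors = FALSE)
head(clust.cut)
```

A data frame shaped like this already carries a label column, so it could be joined to the dendrogram labels on that column.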
Now this list can be converted to a data frame in which the column clust contains the cluster names (two groups in this example); the names are repeated according to the lengths of the list elements.

    clust.gr <- data.frame(num=unlist(gg),
      clust=rep(c("Clust1","Clust2"), times=sapply(gg, length)))
    head(clust.gr)
          num  clust
    sta_1   1 Clust1
    sta_2   2 Clust1
    sta_3   3 Clust1
    sta_5   5 Clust1
    sta_8   8 Clust1
    sta_9   9 Clust1
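The cluster names above are typed by hand for k=2. A minimal sketch of the same step for an arbitrary number of clusters follows, assuming the names can simply be generated with paste0(); the variables arrests, k and clust.k are hypothetical. pdf(NULL) only provides a throwaway graphics device so that rect.hclust() has a plot to draw on; drop that line and dev.off() to see the plot interactively.

```r
arrests <- USArrests
rownames(arrests) <- paste("sta_", 1:50, sep = "")
hc <- hclust(dist(arrests), "average")

k <- 3  # hypothetical choice; any sensible number of clusters

pdf(NULL)                    # graphics device that writes no file
plot(hc)
gg <- rect.hclust(hc, k = k) # list with one element per cluster
dev.off()

# Generate "Clust1", "Clust2", ... and repeat each name once per member
clust.k <- data.frame(num = unlist(gg),
                      clust = rep(paste0("Clust", seq_along(gg)),
                                  times = lengths(gg)))
table(clust.k$clust)
```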
The new data frame is merged with the label() information of the dendr object (the dendro_data() result).

    text.df <- merge(label(dendr), clust.gr, by.x="label", by.y="row.names")
    head(text.df)
       label  x y num  clust
    1  sta_1  8 0   1 Clust1
    2 sta_10 28 0  10 Clust2
    3 sta_11 41 0  11 Clust2
    4 sta_12 31 0  12 Clust2
    5 sta_13 10 0  13 Clust1
    6 sta_14 37 0  14 Clust2
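To make the merge step easier to follow, here is a toy sketch of joining a column in one data frame against the row names of another with by.y="row.names". The data frames labels.df and groups.df are invented for illustration and are not part of the answer.

```r
# Label positions, keyed by a regular column
labels.df <- data.frame(label = c("a", "b", "c"), x = 1:3)

# Group membership, keyed by row names (deliberately in a different order)
groups.df <- data.frame(clust = c("g1", "g2", "g1"),
                        row.names = c("c", "a", "b"))

# Match labels.df$label against rownames(groups.df)
merged <- merge(labels.df, groups.df, by.x = "label", by.y = "row.names")
merged
```

By default merge() sorts the result on the key column, which is why sta_10 follows sta_1 in the output of the answer.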
When plotting the dendrogram, use text.df to add the labels with geom_text() and use the column clust for the colors.

    ggplot() +
      geom_segment(data=segment(dendr), aes(x=x, y=y, xend=xend, yend=yend)) +
      geom_text(data=text.df, aes(x=x, y=y, label=label, hjust=0, color=clust), size=3) +
      coord_flip() + scale_y_reverse(expand=c(0.2, 0)) +
      theme(axis.line.y=element_blank(),
            axis.ticks.y=element_blank(),
            axis.text.y=element_blank(),
            axis.title.y=element_blank(),
            panel.background=element_rect(fill="white"),
            panel.grid=element_blank())