R clustering: silhouette with observation labels
Problem description
I do hierarchical clustering with the cluster package in R. Using the silhouette function, I can get the silhouette plot of my cluster output for any given height (h) cut-off in the dendrogram.
# run hierarchical clustering
if(!require("cluster")) { install.packages("cluster"); require("cluster") }
tmp <- matrix(c( 0, 20, 20, 20, 40, 60, 60, 60, 100, 120, 120, 120,
20, 0, 30, 50, 60, 80, 40, 80, 120, 100, 140, 120,
20, 30, 0, 40, 60, 80, 80, 80, 120, 140, 140, 80,
20, 50, 40, 0, 60, 80, 80, 80, 120, 140, 140, 140,
40, 60, 60, 60, 0, 20, 20, 20, 60, 80, 80, 80,
60, 80, 80, 80, 20, 0, 20, 20, 40, 60, 60, 60,
60, 40, 80, 80, 20, 20, 0, 20, 60, 80, 80, 80,
60, 80, 80, 80, 20, 20, 20, 0, 60, 80, 80, 80,
100, 120, 120, 120, 60, 40, 60, 60, 0, 20, 20, 20,
120, 100, 140, 140, 80, 60, 80, 80, 20, 0, 20, 20,
120, 140, 140, 140, 80, 60, 80, 80, 20, 20, 0, 20,
120, 120, 80, 140, 80, 60, 80, 80, 20, 20, 20, 0),
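nrow = 12, dimnames = list(LETTERS[1:12], LETTERS[1:12]))
cl <- hclust(as.dist(tmp, diag = TRUE, upper = TRUE), method = 'single')
# cut the tree at height 25 and compute the silhouette widths
sil_cl <- silhouette(cutree(cl, h = 25), as.dist(tmp))
# the plot title is an argument of plot(), not of silhouette()
plot(sil_cl, main = 'Good')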
This gives the figure below, which is what frustrates me. How can I use the observation labels, rownames(tmp), in the silhouette plot instead of the numeric indices (1 to 12), which are meaningless to me?
Recommended answer
I'm not sure why, but the silhouette call seems to drop the row names. You can add them back with
cl <- hclust(as.dist(tmp, diag = TRUE, upper = TRUE), method = 'single')
sil_cl <- silhouette(cutree(cl, h = 25), as.dist(tmp))
# restore the observation labels before plotting
rownames(sil_cl) <- rownames(tmp)
plot(sil_cl, main = 'Good')
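The same idea can be written a little more defensively. The sketch below is not part of the original answer: it uses a small made-up data set (the object names x, d, groups and sil are hypothetical) and takes the labels from the cutree result, after checking that they are in the same order as the labels of the distance object. It also assumes that plot() for silhouette objects truncates labels to max.strlen characters (5 by default) and hides them when there are more than nmax.lab observations (40 by default), so those two arguments are raised explicitly.

```r
library(cluster)

# small made-up example: six labelled observations in two obvious groups
x <- matrix(c(0, 0.1, 0.2, 5, 5.1, 5.2), ncol = 1,
            dimnames = list(c("A", "B", "C", "D", "E", "F"), NULL))
d <- dist(x)
cl <- hclust(d, method = "single")
groups <- cutree(cl, k = 2)   # named integer vector, one entry per observation

sil <- silhouette(groups, d)

# make sure the labels line up with the distance object before attaching them
stopifnot(identical(names(groups), attr(d, "Labels")))
rownames(sil) <- names(groups)

# raise the label limits so long names or larger data sets still show labels
plot(sil, main = "Silhouette with observation labels",
     nmax.lab = 100, max.strlen = 20)
```

Taking the labels from names(groups) rather than from the original matrix keeps the labels tied to the clustering result itself, which may help if the distance matrix was subset or reordered earlier in the script.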