R clustering: silhouette with observation labels

Problem description

I do hierarchical clustering with the cluster package in R. Using the silhouette function, I can get the silhouette plot of my cluster output for any given cut-off height (h) in the dendrogram.

# run hierarchical clustering
if(!require("cluster")) { install.packages("cluster");  require("cluster") } 
tmp <- matrix(c( 0,  20,  20,  20,  40,  60,  60,  60, 100, 120, 120, 120,
                 20,   0,  30,  50,  60,  80,  40,  80, 120, 100, 140, 120,
                 20,  30,   0,  40,  60,  80,  80,  80, 120, 140, 140,  80,
                 20,  50,  40,   0,  60,  80,  80,  80, 120, 140, 140, 140,
                 40,  60,  60,  60,   0,  20,  20,  20,  60,  80,  80,  80,
                 60,  80,  80,  80,  20,   0,  20,  20,  40,  60,  60,  60,
                 60,  40,  80,  80,  20,  20,   0,  20,  60,  80,  80,  80,
                 60,  80,  80,  80,  20,  20,  20,   0,  60,  80,  80,  80,
                 100, 120, 120, 120,  60,  40,  60,  60,   0,  20,  20,  20,
                 120, 100, 140, 140,  80,  60,  80,  80,  20,   0,  20,  20,
                 120, 140, 140, 140,  80,  60,  80,  80,  20,  20,   0,  20,
                 120, 120,  80, 140,  80,  60,  80,  80,  20,  20,  20,   0),
                 nrow = 12, dimnames = list(LETTERS[1:12], LETTERS[1:12]))

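# single-linkage clustering on the distance matrix
cl <- hclust(as.dist(tmp, diag = TRUE, upper = TRUE), method = 'single')
# cut the dendrogram at height 25 and compute silhouette widths
sil_cl <- silhouette(cutree(cl, h = 25), as.dist(tmp))
# the plot title is an argument of plot(), not of silhouette()
plot(sil_cl, main = 'Good')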

This gives the figure below, and this is where I get frustrated. How can I use the observation labels rownames(tmp) in the silhouette plot instead of the numeric indices (1 to 12), which mean nothing to me?

Recommended answer

I'm not sure why, but the silhouette call seems to drop the row names. You can add them back with:

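cl <- hclust(as.dist(tmp, diag = TRUE, upper = TRUE), method = 'single')
sil_cl <- silhouette(cutree(cl, h = 25), as.dist(tmp))

# restore the observation labels on the silhouette object
rownames(sil_cl) <- rownames(tmp)

# the plot title is an argument of plot(), not of silhouette()
plot(sil_cl, main = 'Good')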
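
The same fix can be tried on a smaller, self-contained example. This is only a sketch: the toy data and its labels (alpha, beta, ...) are made up for illustration and are not from the question. It also uses the max.strlen argument of plot.silhouette, which, as far as I know, truncates labels to 5 characters by default, so longer names may need a larger value. Labels are also only drawn when the number of observations is below nmax.lab (default 40, if I recall the documentation correctly).

```r
library(cluster)

# Hypothetical toy data: six labelled points on a line, forming two obvious groups
x <- c(alpha = 1, beta = 2, gamma = 3, delta = 10, epsilon = 11, zeta = 12)
d <- dist(x)

cl  <- hclust(d, method = "single")
grp <- cutree(cl, k = 2)   # named integer vector of cluster ids
sil <- silhouette(grp, d)

# Attach the observation labels explicitly, as in the answer above
rownames(sil) <- names(grp)

# Raise max.strlen so that labels longer than 5 characters are not truncated
plot(sil, main = "Toy example", max.strlen = 10)
```

Setting the row names explicitly works whether or not the installed version of cluster keeps them, so it is a safe step to include either way.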
