Matching dendrogram with cluster number in Python's scipy.cluster.hierarchy

Question

The following code generates a simple hierarchical clustering dendrogram with 10 leaf nodes:

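import numpy as np
import scipy.cluster.hierarchy as sch
from scipy.spatial.distance import pdist
import matplotlib.pyplot as plt

# scipy.randn is no longer available in recent SciPy releases; use NumPy instead
X = np.random.randn(10, 2)
d = pdist(X)
Z = sch.linkage(d, method='complete')
P = sch.dendrogram(Z)
plt.show()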

I then generate three flat clusters like so:

T = sch.fcluster(Z, 3, 'maxclust')
# array([3, 1, 1, 2, 2, 2, 2, 2, 1, 2])

However, I'd like to see the cluster labels 1, 2, 3 on the dendrogram. With just 10 leaf nodes and three clusters this is easy to visualize, but with 1000 nodes and 10 clusters I can't see what's going on.

How do I show the cluster numbers on the dendrogram? I'm open to other packages. Thanks.

Recommended answer

Here is a solution that colors each cluster and labels the leaves of the dendrogram with the cluster they belong to (leaves are labeled 'point number,cluster number'). The two techniques can be used independently or together. I modified your original example to include both:

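import numpy as np
import scipy.cluster.hierarchy as sch
from scipy.spatial.distance import pdist
import matplotlib.pyplot as plt

n = 10
k = 3
# scipy.randn is no longer available in recent SciPy releases; use NumPy instead
X = np.random.randn(n, 2)
d = pdist(X)
Z = sch.linkage(d, method='complete')
T = sch.fcluster(Z, k, 'maxclust')

# build leaf labels of the form "point number,cluster number"
labels = [f"{i},{T[i]}" for i in range(n)]

# color threshold: height of the lowest of the top k-1 merges, so only
# those k-1 links stay uncolored and each of the k clusters below them
# gets its own color (this indexing assumes k >= 2)
ct = Z[-(k-1), 2]

# plot
P = sch.dendrogram(Z, labels=labels, color_threshold=ct)
plt.show()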
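
With 1000 leaves the per-leaf labels above become unreadable. One possible way to handle the larger case from the question is to truncate the dendrogram so that each flat cluster is drawn as a single leaf. The sketch below is not part of the original answer. It assumes that `fcluster` with `'maxclust'` returns exactly `k` clusters (no tied merge heights), and the fixed random seed and the `leaf_label` helper are illustrative choices. It uses `sch.leaders` to find the linkage node that roots each flat cluster, then passes a `leaf_label_func` to `dendrogram` with `truncate_mode='lastp'`:

```python
import numpy as np
import scipy.cluster.hierarchy as sch
from scipy.spatial.distance import pdist

n, k = 1000, 10                     # sizes from the question's large case
rng = np.random.default_rng(0)      # fixed seed is an assumption, for repeatability
X = rng.standard_normal((n, 2))
Z = sch.linkage(pdist(X), method='complete')
T = sch.fcluster(Z, k, criterion='maxclust')

# leaders() gives, for each flat cluster, the id of the linkage node that roots it
leader_nodes, cluster_ids = sch.leaders(Z, T)
node_to_cluster = {int(node): int(cid)
                   for node, cid in zip(leader_nodes, cluster_ids)}

def leaf_label(node_id):
    """Hypothetical helper: label a truncated leaf with its flat cluster number."""
    cid = node_to_cluster.get(int(node_id))
    if cid is None:                 # leaf does not root a flat cluster
        return str(node_id)
    size = int(np.sum(T == cid))
    return f"cluster {cid} ({size} pts)"

# Keep only the last k-1 merges as internal nodes, leaving k leaves;
# no_plot=True returns the layout without drawing anything
R = sch.dendrogram(Z, truncate_mode='lastp', p=k,
                   leaf_label_func=leaf_label, no_plot=True)
print(R['ivl'])
```

To draw the figure, drop `no_plot=True`, import `matplotlib.pyplot as plt`, and call `plt.show()` afterwards. If the merge heights contain ties, the truncated leaves may not line up one-to-one with the flat clusters, in which case the helper falls back to printing the raw node id.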
