scipy.cluster.hierarchy: labels seem not to be in the right order, and confusion about the values on the vertical axis

Question

I know that scipy.cluster.hierarchy is designed to work with distance matrices, but what I have is a similarity matrix. After I plot it with dendrogram, something weird happens. Here is the code:

import numpy as np
import matplotlib.pyplot as plt
import scipy.cluster.hierarchy as sch

similarityMatrix = np.array(([1,0.75,0.75,0,0,0,0],
                         [0.75,1,1,0.25,0,0,0],
                         [0.75,1,1,0.25,0,0,0],
                         [0,0.25,0.25,1,0.25,0.25,0],
                         [0,0,0,0.25,1,1,0.75],
                         [0,0,0,0.25,1,1,0.75],
                         [0,0,0,0,0.75,0.75,1]))

Here is the linkage call:

Z_sim = sch.linkage(similarityMatrix)
plt.figure(1)
plt.title('similarity')
sch.dendrogram(
    Z_sim,
    labels=['1','2','3','4','5','6','7']
)
plt.show()

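But this is the result (the dendrogram image is not reproduced here):

My questions are:

  1. Why are the labels of this dendrogram not correct?
  2. I passed a similarity matrix to the linkage method, but I can't fully understand what the vertical axis means. For example, since the maximum similarity is 1, why is the maximum value on the vertical axis close to 1.6?

Thanks a lot for your help!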

Recommended answer

  • linkage expects "distances", not "similarities". To convert your matrix to something like a distance matrix, you can subtract it from 1:

      dist = 1 - similarityMatrix
      

  • linkage does not accept a square distance matrix. It expects the distance data to be in "condensed" form. You can get that using scipy.spatial.distance.squareform:

      from scipy.spatial.distance import squareform
      
      dist = 1 - similarityMatrix
      condensed_dist = squareform(dist)
      Z_sim = sch.linkage(condensed_dist)
      

    (When you pass a two-dimensional array with shape (m, n) to linkage, it treats the rows as points in n-dimensional space and computes the distances between them internally. That is why the vertical axis in the original plot is not on the similarity scale.)
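The two points above can be checked with a minimal sketch, assuming the same 7x7 similarity matrix from the question. The names same_as_pdist, top_height_rows, top_height_dist and leaf_order are illustrative and not part of the original post. The first part compares the original call with clustering on the Euclidean distances between rows (pdist), which is what the default metric should amount to; the second part applies the suggested fix and reads the leaf order back from dendrogram with no_plot=True, so no figure is needed.

```python
import numpy as np
import scipy.cluster.hierarchy as sch
from scipy.spatial.distance import pdist, squareform

similarityMatrix = np.array([
    [1,    0.75, 0.75, 0,    0,    0,    0],
    [0.75, 1,    1,    0.25, 0,    0,    0],
    [0.75, 1,    1,    0.25, 0,    0,    0],
    [0,    0.25, 0.25, 1,    0.25, 0.25, 0],
    [0,    0,    0,    0.25, 1,    1,    0.75],
    [0,    0,    0,    0.25, 1,    1,    0.75],
    [0,    0,    0,    0,    0.75, 0.75, 1],
])
labels = ['1', '2', '3', '4', '5', '6', '7']

# What the original call computed: each row is treated as a point in
# 7-dimensional space, so the heights are Euclidean distances between rows.
Z_rows = sch.linkage(similarityMatrix)
Z_pdist = sch.linkage(pdist(similarityMatrix, metric='euclidean'))
same_as_pdist = np.allclose(Z_rows, Z_pdist)
top_height_rows = Z_rows[-1, 2]   # height of the final merge

# The suggested fix: turn similarities into distances, then condense.
dist = 1 - similarityMatrix
Z_sim = sch.linkage(squareform(dist))
top_height_dist = Z_sim[-1, 2]

# no_plot=True returns the layout data without drawing anything;
# 'ivl' holds the leaf labels in left-to-right plotting order.
info = sch.dendrogram(Z_sim, labels=labels, no_plot=True)
leaf_order = info['ivl']

print("original call equals pdist-based linkage:", same_as_pdist)
print("top merge height, rows as points:", top_height_rows)
print("top merge height, 1 - similarity:", top_height_dist)
print("leaf order:", leaf_order)
```

For this data the top merge height of the original call should come out at roughly 1.54, which matches the "close to 1.6" axis in the question, while the corrected call tops out at 0.75. Note also that the leaf order returned in 'ivl' follows the tree structure rather than the input order, so labels appearing out of sequence along the horizontal axis is not by itself a sign that they are attached to the wrong leaves.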
