What is the meaning of the return values of scipy.cluster.hierarchy.linkage?


Question

Let us assume that we have a matrix X as follows:

[[9 0]
[1 4]
[2 3]
[8 5]]

Then

import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[9, 0], [1, 4], [2, 3], [8, 5]])
Z = linkage(X, method="ward")
print(Z)

prints the following matrix:

[[  1.           2.           1.41421356   2.        ]
 [  0.           3.           5.09901951   2.        ]
 [  4.           5.          10.           4.        ]]

What is the meaning of the returned values?

Recommended answer

Although this has been answered before, it was a "read the docs" answer. I think it is useful to explain the docs a bit.

From the docs, we read:

An (n-1) by 4 matrix Z is returned. At the i-th iteration, clusters with indices Z[i, 0] and Z[i, 1] are combined to form cluster n + i. A cluster with an index less than n corresponds to one of the n original observations. The distance between clusters Z[i, 0] and Z[i, 1] is given by Z[i, 2]. The fourth value Z[i, 3] represents the number of original observations in the newly formed cluster.
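To make the quoted description concrete, here is a minimal sketch that walks over the rows of Z from the question and records which original observations end up in each newly formed cluster. The `members` dictionary is a name introduced here for illustration; it is not part of SciPy.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage

X = np.array([[9, 0], [1, 4], [2, 3], [8, 5]])
Z = linkage(X, method="ward")
n = X.shape[0]

# Clusters 0 .. n-1 are the original observations (singletons).
members = {i: [i] for i in range(n)}

for i, (a, b, dist, count) in enumerate(Z):
    # Row i merges clusters a and b into the new cluster n + i.
    members[n + i] = sorted(members[int(a)] + members[int(b)])
    # Z[i, 3] is the number of original observations in the new cluster.
    assert int(count) == len(members[n + i])
    print(f"cluster {n + i}: merges {int(a)} and {int(b)} "
          f"at distance {dist:.4f}, observations {members[n + i]}")
```

For the matrix in the question, this reports cluster 4 as the merge of observations 1 and 2, cluster 5 as the merge of observations 0 and 3, and cluster 6 as the merge of clusters 4 and 5.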

I think the confusing part is that the first n clusters are singletons (the "original observations"). So the first row of Z is actually the (n+1)-th cluster, i.e. the cluster with index n. It is the first cluster to combine two singletons.

So in your example, Z[0] is the (4+1)-th cluster, i.e. the cluster with index 4. We have

 Z[0] = [  1.           2.           1.41421356   2.        ]

The first two values tell us which clusters were merged to create cluster Z[0]. They are cluster_1, the singleton [1, 4], and cluster_2, the singleton [2, 3].

The third value gives us the distance between the clusters. We can verify that sqrt((2-1)^2 + (3-4)^2) = 1.41...
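The same check can be done numerically for every row. The sketch below assumes the question's X and method="ward". For two singletons the third column is the plain Euclidean distance. For the last merge it uses the closed form of the Ward distance between two clusters A and B, sqrt(2|A||B| / (|A| + |B|)) times the distance between their centroids. The helper `ward_distance` is written here for illustration and is not a SciPy function.

```python
import numpy as np

X = np.array([[9, 0], [1, 4], [2, 3], [8, 5]], dtype=float)

# Z[0, 2]: distance between the singletons X[1] and X[2].
d_12 = np.linalg.norm(X[1] - X[2])

# Z[1, 2]: distance between the singletons X[0] and X[3].
d_03 = np.linalg.norm(X[0] - X[3])


def ward_distance(A, B):
    """Ward distance between two clusters given as arrays of points.

    Illustrative helper, not part of SciPy.
    """
    size_term = 2.0 * len(A) * len(B) / (len(A) + len(B))
    centroid_gap = np.linalg.norm(A.mean(axis=0) - B.mean(axis=0))
    return np.sqrt(size_term) * centroid_gap


# Z[2, 2]: distance between the clusters {1, 2} and {0, 3}.
d_last = ward_distance(X[[1, 2]], X[[0, 3]])

print(d_12, d_03, d_last)
```

The three values should agree with the third column of Z in the question (1.41421356, 5.09901951 and 10) up to floating-point rounding.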

The fourth value tells us how many singletons are in cluster Z[0]: here, two.

So looking at your last cluster, Z[2], we see that it combines the first two clusters in Z (the clusters with indices 4 and 5, i.e. Z[0] and Z[1]). Each of them contains two unique singletons, so Z[2, 3] = 4.
