Clustering - how to find the nearest cluster
Question
Some hints I received on a different question left me rather puzzled.
I was given an exercise, which is actually part of a larger one:
- Cluster some data using hclust (done)
- Given a totally new vector, find out which of the resulting clusters it is nearest to.
According to the exercise, this should take only a short time.
However, after weeks I am unsure whether this can be done at all, since apparently all I really get from hclust is a tree and not, as I had assumed, a number of clusters.
I guess I was not clear enough:
Say, for instance, I feed hclust a matrix consisting of 15 vectors of size 1x5: 5 times (1 1 1 1 1), 5 times (2 2 2 2 2) and 5 times (3 3 3 3 3). This should give me three quite distinct clusters of size 5; anyone can easily do that by hand. Is there a command I can use to find out from the program that there are 3 such clusters in my hclust object, and what they contain?
Recommended answer
In contrast to k-means, clusters found by hclust can be of arbitrary shape.
The distance to the nearest cluster center is therefore not always meaningful.
A 1-nearest-neighbor style assignment is probably better: give the new vector the cluster of the already-clustered point closest to it.
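As a rough illustration, here is a minimal R sketch using the toy data from the question. It assumes the number of clusters (k = 3) is known in advance and uses cutree() from base R's stats package to turn the hclust tree into flat cluster labels. The helper assign_nearest and the vector new_vec are hypothetical names introduced only for this example, and Euclidean distance is an assumption, not something the exercise specifies.

```r
# Toy data from the question: 15 row vectors of length 5.
x <- rbind(
  matrix(1, nrow = 5, ncol = 5),
  matrix(2, nrow = 5, ncol = 5),
  matrix(3, nrow = 5, ncol = 5)
)

# Hierarchical clustering on Euclidean distances
# (complete linkage is the hclust default).
hc <- hclust(dist(x))

# Cut the tree into k = 3 groups (k is assumed known here);
# cutree() returns one cluster label per row of x.
labels <- cutree(hc, k = 3)

# Row indices belonging to each cluster.
members <- split(seq_len(nrow(x)), labels)

# Hypothetical helper: 1-nearest-neighbor assignment.
# The new vector receives the label of its closest clustered point.
assign_nearest <- function(new_point, data, labels) {
  d <- sqrt(rowSums(sweep(data, 2, new_point)^2))  # Euclidean distance to each row
  labels[which.min(d)]
}

new_vec <- c(2.1, 1.9, 2.0, 2.2, 1.8)  # hypothetical new observation
assigned <- assign_nearest(new_vec, x, labels)
print(members)
print(assigned)
```

If the number of clusters is not known, cutree() also accepts a height argument h instead of k, which cuts the tree at a given dissimilarity level; picking that level remains a judgment call.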