How to pick the T1 and T2 threshold values for Canopy Clustering?

Views: 282 Published: 2020/4/26 10:22:42 Tags: cluster-analysis subset k-means


Problem description

I am trying to implement the Canopy clustering algorithm along with K-Means. From what I have read online, Canopy clustering can be used to get the initial starting points to feed into K-means. The problem is that Canopy clustering requires two threshold values per canopy, T1 and T2: points within the inner threshold are strongly tied to that canopy, and points within the wider threshold are less strongly tied to it. How are these thresholds, that is, the distances from the canopy center, determined?
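To make the role of the two thresholds concrete, here is a minimal sketch of one canopy pass over 1-D data. It assumes the common convention that T1 is the loose (outer) distance and T2 the tight (inner) distance, with T1 > T2; the function name `canopy_1d` and the example thresholds are illustrative choices, not recommended values.

```python
def canopy_1d(points, t1, t2):
    """One canopy pass over 1-D numbers.

    Assumption: t1 is the loose (outer) threshold and t2 the tight
    (inner) threshold, so t1 > t2 > 0.
    Returns a list of (center, members) pairs.
    """
    if not t1 > t2 > 0:
        raise ValueError("expected t1 > t2 > 0")

    remaining = list(points)
    canopies = []
    while remaining:
        # Take the first remaining point as the next canopy center.
        center = remaining[0]
        # Every remaining point within the loose threshold joins this canopy.
        members = [p for p in remaining if abs(p - center) < t1]
        canopies.append((center, members))
        # Points within the tight threshold can no longer start or join
        # another canopy; the center itself (distance 0) is always dropped.
        remaining = [p for p in remaining if abs(p - center) >= t2]
    return canopies


# Sample values taken from the question; thresholds chosen arbitrarily.
data = [8, 17.5, 17.5, 23, 66]
for center, members in canopy_1d(data, t1=10, t2=5):
    print(center, members)
```

With this structure, a larger T2 removes more points per pass and yields fewer canopies, while T1 only controls how much neighbouring canopies overlap.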

Problem context:

The problem I'm trying to solve is this: I have a set of numbers in a range such as [1,30] or [1,250], with about 50 elements per set. There can be duplicate elements, and the values can be floating-point numbers, for example 8, 17.5, 17.5, 23, 66, ... I want to find the optimal clusters, or subsets, of the set of numbers.

So, if Canopy clustering with K-means is a good choice, my question still stands: how do you find the T1 and T2 values? If it is not a good choice, is there a better, simpler, but still effective algorithm to use?
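For reference, the "feed the starting points into K-means" step described in the question can be sketched as below for 1-D data. This is a plain Lloyd-style iteration written for illustration only; `kmeans_1d` is a hypothetical helper name, and the initial centers are assumed to come from a prior canopy pass (or any other seeding method).

```python
def kmeans_1d(points, initial_centers, max_iter=100):
    """Lloyd-style k-means on 1-D numbers, seeded with given centers.

    Assumption: initial_centers is non-empty and comes from some seeding
    step such as a canopy pass. Returns (centers, clusters).
    """
    centers = list(initial_centers)
    clusters = [[] for _ in centers]
    for _ in range(max_iter):
        # Assignment step: each point goes to its nearest center.
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Update step: move each center to the mean of its cluster;
        # an empty cluster keeps its previous center.
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments can no longer change
        centers = new_centers
    return centers, clusters


# Sample values from the question, seeded with two arbitrary centers.
centers, clusters = kmeans_1d([8, 17.5, 17.5, 23, 66], initial_centers=[8, 66])
print(centers, clusters)
```

The number of seeds fixes k, so in this setup the choice of T1 and T2 indirectly decides how many clusters K-means will produce.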
