Efficiently grouping similar numbers together
Question
Possible duplicate:
1D number array clustering
I have an array of numbers like [1, 20, 300, 45, 5, 60, 10, 270, 3]. What is an efficient algorithm for grouping these numbers together based on proximity? In this case I'd expect something like [1, 3, 5], [20, 45, 60] and [270, 300].
Recommended answer
The hardest part of what you are asking is how to actually define proximity. What would you expect the output to be from [5,10,15,20]? Would it be the same groupings as for [500,1000,1500,2000]?
What about [1,2,3,5,7,8,9]? Should there be one group or three (or two)?
What about [1,2,3,5,7,8,9,1075,4000]? Do 1075 and 4000 get grouped together? Do the larger numbers in the sample change how the smaller numbers are grouped?
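To make these questions concrete, here is a minimal sketch of one possible definition of proximity: sort the numbers and start a new group whenever the gap to the previous number exceeds a threshold. The function name `group_by_gap` and the `max_gap` parameter are assumptions made for illustration; they are not part of the question or of any library.

```python
def group_by_gap(numbers, max_gap):
    """Group sorted numbers, splitting wherever neighbours differ by more than max_gap.

    max_gap is a hypothetical, caller-chosen threshold: it is one way to
    define "proximity", not the only one.
    """
    groups = []
    for n in sorted(numbers):
        if groups and n - groups[-1][-1] <= max_gap:
            # Close enough to the last number of the current group: extend it.
            groups[-1].append(n)
        else:
            # First number, or the gap is too large: start a new group.
            groups.append([n])
    return groups


print(group_by_gap([1, 2, 3, 5, 7, 8, 9], max_gap=1))  # [[1, 2, 3], [5], [7, 8, 9]]
print(group_by_gap([1, 2, 3, 5, 7, 8, 9], max_gap=2))  # [[1, 2, 3, 5, 7, 8, 9]]
```

The same input yields three groups or one depending on the threshold, and a fixed absolute threshold treats [5,10,15,20] very differently from [500,1000,1500,2000]. Sorting dominates the cost, so this runs in O(n log n).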
This is the question that an entire field of machine learning, cluster analysis, sets out to answer. Perhaps this related question will help?
I think what you want is K-means clustering (helpfully linked to in the related question), but to use it you need to know in advance how many groups you want to split your data into.
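As a rough illustration, K-means on one-dimensional data can be sketched in plain Python as below. This is a simplified version of the usual iterative scheme (assign each number to its nearest center, then move each center to the mean of its group), not a tuned implementation. The function name `kmeans_1d`, the `max_iter` parameter, and the choice of evenly spaced starting centers are all assumptions made for this example.

```python
def kmeans_1d(numbers, k, max_iter=100):
    """Split numbers into at most k groups with a basic 1-D K-means loop."""
    data = sorted(numbers)
    if not 1 <= k <= len(data):
        raise ValueError("k must be between 1 and the number of values")

    # Assumption: start from k evenly spaced elements of the sorted data.
    # Real implementations often use random or k-means++ initialisation.
    centers = [data[(2 * i + 1) * len(data) // (2 * k)] for i in range(k)]

    groups = []
    for _ in range(max_iter):
        # Assignment step: put each value in the group of its nearest center.
        groups = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda i: abs(x - centers[i]))
            groups[nearest].append(x)

        # Update step: move each center to the mean of its group
        # (an empty group keeps its previous center).
        new_centers = [
            sum(g) / len(g) if g else c for g, c in zip(groups, centers)
        ]
        if new_centers == centers:
            break  # Centers stopped moving, so the assignment is stable.
        centers = new_centers

    return [g for g in groups if g]


print(kmeans_1d([1, 20, 300, 45, 5, 60, 10, 270, 3], k=3))
```

The grouping depends on both `k` and the starting centers, so it will not necessarily match the grouping you had in mind; that is the "how do you define proximity" problem again.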