Efficiently grouping similar numbers together
Question
Possible duplicate: 1-D number array clustering
I have an array of numbers like [1, 20, 300, 45, 5, 60, 10, 270, 3]. What is an efficient algorithm for grouping these numbers together based on proximity? In this case I'd expect something like [1, 3, 5], [20, 45, 60] and [270, 300].
Recommended answer
The hardest part of what you are asking is how to actually define proximity. What would you expect the output to be from [5,10,15,20]? Would it be the same groupings as for [500,1000,1500,2000]?
What about [1,2,3,5,7,8,9]? Should there be one group or three (or two)?
What about [1,2,3,5,7,8,9,1075,4000]? Do 1075 and 4000 end up grouped together? Do the groupings of the smaller numbers change because of the larger numbers in the sample?
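To make the ambiguity concrete, here is a minimal sketch of one possible definition of proximity: sort the values and start a new group whenever the gap to the previous value exceeds a threshold. The function name `group_by_gap` and the parameter `max_gap` are made up for illustration and are not part of the question; the point is only that the answer depends on the threshold you pick.

```python
def group_by_gap(numbers, max_gap):
    """Group sorted values, splitting wherever the gap to the previous
    value is larger than max_gap (a hypothetical, user-chosen threshold)."""
    groups = []
    for value in sorted(numbers):
        if groups and value - groups[-1][-1] <= max_gap:
            # Close enough to the last value of the current group.
            groups[-1].append(value)
        else:
            # First value, or gap too large: start a new group.
            groups.append([value])
    return groups


data = [1, 2, 3, 5, 7, 8, 9]
print(group_by_gap(data, 1))  # [[1, 2, 3], [5], [7, 8, 9]]
print(group_by_gap(data, 2))  # [[1, 2, 3, 5, 7, 8, 9]]
```

With a threshold of 1 the sample splits into three groups; with a threshold of 2 it collapses into one. Neither is wrong until "proximity" has been defined.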
This question is the subject of an entire field of machine learning: cluster analysis. Perhaps this related question will help: http://stackoverflow.com/questions/6147466/what-clustering-algorithm-to-use-on-1-d-data
I think what you want is K-means clustering (helpfully linked to in the related question), but to use it you need to know in advance how many groups you want to split your data into.
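As a rough illustration of K-means on one-dimensional data, here is a small pure-Python sketch of Lloyd's algorithm. The function name `kmeans_1d`, the `max_iter` parameter, and the choice of starting centers (values spread evenly across the sorted input) are assumptions made for this example. It also assumes a non-empty input, and a different initialization may converge to different groups.

```python
def kmeans_1d(numbers, k, max_iter=100):
    """Cluster a non-empty list of numbers into at most k groups
    using Lloyd's algorithm (a sketch, not a tuned implementation)."""
    data = sorted(numbers)
    n = len(data)
    # Assumed initialization: pick k values spread across the sorted data.
    centers = [data[(2 * i + 1) * n // (2 * k)] for i in range(k)]
    clusters = []
    for _ in range(max_iter):
        # Assignment step: each value goes to its nearest center.
        clusters = [[] for _ in range(k)]
        for x in data:
            nearest = min(range(k), key=lambda i: abs(x - centers[i]))
            clusters[nearest].append(x)
        # Update step: move each center to the mean of its cluster
        # (an empty cluster keeps its previous center).
        new_centers = [
            sum(c) / len(c) if c else centers[i]
            for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # Converged: assignments can no longer change.
        centers = new_centers
    return [c for c in clusters if c]


print(kmeans_1d([1, 20, 300, 45, 5, 60, 10, 270, 3], 3))
# [[1, 3, 5, 10, 20], [45, 60], [270, 300]]
```

With this initialization, 20 lands with the small values rather than with 45 and 60, which differs from the grouping the question expected. That is one more sign that the result depends on how proximity is defined, for example on whether distances are measured on the raw values or on a logarithmic scale.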