Efficiently grouping similar numbers together

Question

Possible duplicate:
1D number array clustering

I have an array of numbers like [1, 20, 300, 45, 5, 60, 10, 270, 3]. What is an efficient algorithm for grouping these numbers together based on proximity? In this case I'd expect something like [1, 3, 5], [20, 45, 60] and [270, 300].

Recommended answer

The hardest part of what you are asking is how to actually define proximity. What would you expect the output to be from [5,10,15,20]? Would it be the same groupings as for [500,1000,1500,2000]?

What about [1,2,3,5,7,8,9]? Should there be one group or three? (Or two?)
What about [1,2,3,5,7,8,9,1075,4000]? Do 1075 and 4000 get grouped together? Do the groupings of the smaller numbers change because of the larger numbers in the sample?
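To make these questions concrete, here is a minimal sketch of one possible definition of proximity: sort the values and start a new group whenever the gap to the previous value exceeds a fixed threshold. The function name `group_by_gap` and the `max_gap` parameter are illustrative assumptions, not something the question or answer specifies.

```python
def group_by_gap(numbers, max_gap):
    """Group sorted values, splitting wherever neighbours differ by more than max_gap.

    max_gap is an assumed, caller-chosen absolute threshold; picking it is
    exactly the "how do you define proximity" problem.
    """
    groups = []
    for value in sorted(numbers):
        # Extend the current group if this value is close enough to its last member.
        if groups and value - groups[-1][-1] <= max_gap:
            groups[-1].append(value)
        else:
            groups.append([value])
    return groups


print(group_by_gap([1, 2, 3, 5, 7, 8, 9], 1))       # three groups
print(group_by_gap([1, 2, 3, 5, 7, 8, 9], 2))       # one group
print(group_by_gap([5, 10, 15, 20], 5))             # one group
print(group_by_gap([500, 1000, 1500, 2000], 5))     # four singleton groups
```

With an absolute threshold, [1,2,3,5,7,8,9] is one group or three depending on `max_gap`, and [5,10,15,20] groups differently from [500,1000,1500,2000], so the answer depends on the chosen definition.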

This question is the subject of an entire field of machine learning: cluster analysis. Perhaps this related question will help?

I think what you want is K-means clustering (helpfully linked to in the related question), but you need to know how many groups you want to split your data into to use it.
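As a rough illustration of K-means on one-dimensional data, here is a small pure-Python sketch of Lloyd's algorithm. The function name `kmeans_1d`, the `max_iter` parameter, and the deterministic initialisation (centres taken at evenly spaced positions in the sorted list) are assumptions made for this example; real implementations usually use random or k-means++ initialisation and may return different clusters.

```python
def kmeans_1d(numbers, k, max_iter=100):
    """Cluster 1D values into at most k groups with Lloyd's algorithm (sketch)."""
    values = sorted(numbers)
    if not 1 <= k <= len(values):
        raise ValueError("k must be between 1 and the number of values")

    # Assumed deterministic initialisation: k values spread through the sorted list.
    centers = [values[(2 * i + 1) * len(values) // (2 * k)] for i in range(k)]

    for _ in range(max_iter):
        # Assignment step: each value joins the cluster with the nearest centre.
        clusters = [[] for _ in range(k)]
        for v in values:
            nearest = min(range(k), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)

        # Update step: move each centre to the mean of its cluster
        # (an empty cluster keeps its previous centre).
        new_centers = [
            sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)
        ]
        if new_centers == centers:
            break  # assignments have stabilised
        centers = new_centers

    return [c for c in clusters if c]


print(kmeans_1d([1, 20, 300, 45, 5, 60, 10, 270, 3], 3))
```

With this initialisation and k = 3, the sample from the question comes out as [1, 3, 5, 10, 20], [45, 60], [270, 300], which is not the grouping the question hoped for. K-means minimises squared distance to the cluster means, and that may not match an intuitive notion of proximity.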
