1D Number Array Clustering


Question

Possible duplicate: Cluster one-dimensional data optimally?

So let's say I have an array like this:

[1,1,2,3,10,11,13,67,71]

Is there a convenient way to partition the array into something like this?

[[1,1,2,3],[10,11,13],[67,71]]

I looked through similar questions, yet most people suggested using k-means to cluster points, e.g. with SciPy, which is quite confusing for a beginner like me. Also, I think k-means is more suitable for clustering in two or more dimensions, right? Is there a way to partition an array of N numbers into several partitions/clusters depending on the numbers themselves?

Some people also suggest rigid range partitioning, but it doesn't always give the expected results.

Recommended answer

Don't use multidimensional clustering algorithms for a one-dimensional problem. A single dimension is much more special than you might naively think, because you can actually sort it, which makes things a lot easier.
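To illustrate why sorting helps, here is a minimal sketch of one very simple approach: sort the values and cut at the largest gaps between neighbours. This is not a method named in the answer, only an illustration; the function name `split_at_largest_gaps` and the parameter `k` (the desired number of groups) are assumptions made for this example.

```python
def split_at_largest_gaps(values, k):
    """Split values into k groups by cutting at the k-1 largest gaps
    between neighbouring values in sorted order (illustrative sketch)."""
    data = sorted(values)
    if k < 1 or k > len(data):
        raise ValueError("k must be between 1 and len(values)")

    # Index i stands for the gap between data[i] and data[i + 1].
    gap_indices = sorted(
        range(len(data) - 1),
        key=lambda i: data[i + 1] - data[i],
        reverse=True,
    )
    # Keep the k-1 widest gaps and restore left-to-right order.
    cuts = sorted(gap_indices[: k - 1])

    groups, start = [], 0
    for i in cuts:
        groups.append(data[start : i + 1])
        start = i + 1
    groups.append(data[start:])
    return groups


print(split_at_largest_gaps([1, 1, 2, 3, 10, 11, 13, 67, 71], 3))
# [[1, 1, 2, 3], [10, 11, 13], [67, 71]]
```

This only works because one-dimensional data has a total order; it still requires choosing `k` up front and ignores how dense each group is.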

In fact, it is usually not even called clustering, but e.g. segmentation or natural breaks optimization.
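As a rough sketch of what "natural breaks optimization" can mean in practice, the following dynamic program splits the sorted values into `k` contiguous groups so that the total within-group sum of squared deviations is minimal. It is written in the spirit of such methods, not as a reference implementation of any particular published algorithm; the name `natural_breaks` and the parameter `k` are assumptions for this example.

```python
def natural_breaks(values, k):
    """Split sorted values into k contiguous groups minimising the total
    within-group sum of squared deviations (illustrative O(k * n^2) sketch)."""
    data = sorted(values)
    n = len(data)
    if not 1 <= k <= n:
        raise ValueError("k must be between 1 and len(values)")

    # Prefix sums let us compute the cost of any segment in O(1).
    s1, s2 = [0.0], [0.0]
    for x in data:
        s1.append(s1[-1] + x)
        s2.append(s2[-1] + x * x)

    def sse(i, j):
        """Sum of squared deviations from the mean for data[i:j], j > i."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best cost of splitting data[:j] into c groups.
    cost = [[inf] * (n + 1) for _ in range(k + 1)]
    back = [[0] * (n + 1) for _ in range(k + 1)]
    cost[0][0] = 0.0
    for c in range(1, k + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                candidate = cost[c - 1][i] + sse(i, j)
                if candidate < cost[c][j]:
                    cost[c][j] = candidate
                    back[c][j] = i

    # Walk the back-pointers from the end to recover the groups.
    groups, j = [], n
    for c in range(k, 0, -1):
        i = back[c][j]
        groups.append(data[i:j])
        j = i
    return groups[::-1]


print(natural_breaks([1, 1, 2, 3, 10, 11, 13, 67, 71], 3))
# [[1, 1, 2, 3], [10, 11, 13], [67, 71]]
```

Restricting the search to contiguous runs of the sorted array is what keeps this tractable, and it is only possible in one dimension.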

You might want to look at Jenks Natural Breaks Optimization and similar statistical methods. Kernel Density Estimation (KDE) is also a good method to look at, with a strong statistical background. Local minima in the density are good places to split the data into clusters, with statistical reasons to do so. KDE is perhaps the most sound method for clustering 1-dimensional data.
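The idea of splitting at local minima of the density can be sketched with the standard library alone. The code below evaluates an unnormalised Gaussian kernel density on an evenly spaced grid and cuts wherever the density has a local minimum. The function name `kde_split`, the grid size, and especially the `bandwidth` value are assumptions for this example; the result depends strongly on the bandwidth, and the answer does not say how to choose it.

```python
import math


def kde_split(values, bandwidth, grid_size=512):
    """Split 1D values at local minima of a Gaussian kernel density
    estimate evaluated on a regular grid (illustrative sketch)."""
    data = sorted(values)
    if not data:
        return []

    lo, hi = data[0], data[-1]
    step = (hi - lo) / (grid_size - 1)
    grid = [lo + i * step for i in range(grid_size)]

    def density(x):
        # Unnormalised: constant factors do not move the minima.
        return sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in data)

    dens = [density(x) for x in grid]

    # A grid point is a cut if the density falls into it and does not
    # fall further afterwards (flat valleys yield a single cut).
    cuts = [
        grid[i]
        for i in range(1, grid_size - 1)
        if dens[i] < dens[i - 1] and dens[i] <= dens[i + 1]
    ]

    groups, current, c = [], [], 0
    for v in data:
        # Close the current group each time we pass a cut point.
        while c < len(cuts) and v > cuts[c]:
            if current:
                groups.append(current)
                current = []
            c += 1
        current.append(v)
    groups.append(current)
    return groups


print(kde_split([1, 1, 2, 3, 10, 11, 13, 67, 71], bandwidth=3))
```

With the example array and a bandwidth of 3, the density has one valley between 3 and 10 and another between 13 and 67, so the values fall into the three groups asked for in the question. A much smaller bandwidth would produce more groups and a much larger one fewer.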

With KDE, it again becomes obvious that 1-dimensional data is much better behaved. In 1D, you have local minima; but in 2D you may have saddle points and similar "maybe" splitting points. See the Wikipedia illustration of a saddle point for how such a point may or may not be appropriate for splitting clusters.
