最佳地对一维数据进行聚类? [英] Cluster one-dimensional data optimally?

查看:1135
本文介绍了最佳地对一维数据进行聚类?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

是否有人有论文解释 Ckmeans.1d. dp 算法有效吗?

或者:在一维中进行k均值聚类的最佳方法是什么?

解决方案

单变量k均值聚类可以基于Monge矩阵的理论结果在O(kn)时间(已排序的输入)中求解,但该方法并非如此最受欢迎的原因可能是数值不稳定,也可能是编码方面的挑战.

一个更好的选择是O(knlgn)方法,该方法现在在Ckmeans.1d.dp版本3.4.6中实现.此实现与启发式k均值一样快,但提供了保证的最优性,比启发式k均值好几个数量级,尤其对于大k而言.

Richard Bellman(1973)提出的通用动态编程解决方案没有涉及k-means问题的细节,并且隐含的运行时间为O(kn ^ 3).

Does anyone have a paper that explains how the Ckmeans.1d.dp algorithm works?

Or: what is the most optimal way to do k-means clustering in one-dimension?

解决方案

Univariate k-means clustering can be solved in O(kn) time (on already sorted input) based on theoretical results on Monge matrices, but the approach was not popular most likely due to numerical instability and also perhaps coding challenges.

A better option is an O(knlgn) method that is now implemented in Ckmeans.1d.dp version 3.4.6. This implementation is as fast as heuristic k-means but offers guaranteed optimality, orders of magnitude better than heuristic k-means especially for large k's.

The generic dynamic programming solution by Richard Bellman (1973) does not touch upon specifics of the k-means problem and the implied runtime is O(kn^3).

这篇关于最佳地对一维数据进行聚类?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆