What is the time complexity of k-means?
Question
I was going through the k-means Wikipedia page. Based on the algorithm, I think the complexity is O(n*k*i) (n = total number of elements, k = number of clusters, i = number of iterations).
So can someone explain this statement from Wikipedia to me, and how the problem is NP-hard?
If k and d (the dimension) are fixed, the problem can be exactly solved in time $O(n^{dk+1} \log n)$, where n is the number of entities to be clustered.
Answer
It depends on what you call k-means.
The problem of finding the global optimum of the k-means objective function

$$\underset{S}{\arg\min} \sum_{i=1}^{k} \sum_{x_j \in S_i} \lVert x_j - \mu_i \rVert^2$$

is NP-hard, where $S_i$ is cluster $i$ (and there are k clusters), $x_j$ is a d-dimensional point in cluster $S_i$, and $\mu_i$ is the centroid (average of the points) of cluster $S_i$.
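To make the objective concrete, here is a minimal sketch in plain Python that evaluates it for a given labeling and finds the global optimum by trying every labeling. The function names (`centroid`, `objective`, `brute_force_optimum`) and the toy data are assumptions made for illustration. The exhaustive search visits k**n labelings, so it only shows why a naive exact search blows up; it is not the fixed-k, fixed-d exact algorithm mentioned in the Wikipedia quote, and it proves nothing about NP-hardness.

```python
from itertools import product


def centroid(points):
    """Coordinate-wise mean of a non-empty list of d-dimensional points."""
    d = len(points[0])
    return [sum(p[c] for p in points) / len(points) for c in range(d)]


def objective(points, labels, k):
    """Sum of squared distances from each point to its cluster's centroid."""
    total = 0.0
    for i in range(k):
        cluster = [p for p, lab in zip(points, labels) if lab == i]
        if not cluster:
            continue  # an empty cluster contributes nothing
        mu = centroid(cluster)
        total += sum(sum((a - b) ** 2 for a, b in zip(p, mu)) for p in cluster)
    return total


def brute_force_optimum(points, k):
    """Try all k**n labelings; only feasible for very small n."""
    best = min(product(range(k), repeat=len(points)),
               key=lambda labels: objective(points, labels, k))
    return list(best), objective(points, best, k)


# Hypothetical toy data: two well-separated pairs of 2-d points.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0)]
labels, cost = brute_force_optimum(points, k=2)
print(labels, cost)
```

With n = 4 and k = 2 this checks only 16 labelings, but at n = 100 the same loop would need 2**100 evaluations.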
However, running a fixed number t of iterations of the standard algorithm takes only O(t*k*n*d) time for n (d-dimensional) points, where k is the number of centroids (or clusters). This is what practical implementations do (often with random restarts between the iterations).
The standard algorithm only approximates a local optimum of the above function, and so do all the k-means algorithms that I've seen.
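The O(t*k*n*d) bound can be read off the loop structure of the standard (Lloyd's) algorithm. The following is a rough sketch, assuming random initialization from the data points and a hypothetical `ops` counter added purely to count the coordinate-level work in the assignment step; real implementations differ in initialization, stopping criteria, and empty-cluster handling.

```python
import random


def lloyd(points, k, t, seed=0):
    """Run exactly t iterations of the standard k-means algorithm.

    Returns (centroids, labels, ops), where ops counts coordinate-level
    operations performed in the assignment step.
    """
    rng = random.Random(seed)
    n, d = len(points), len(points[0])
    # Assumption: initialize centroids as k distinct random data points.
    centroids = [list(p) for p in rng.sample(points, k)]
    labels = [0] * n
    ops = 0
    for _ in range(t):
        # Assignment step: n points x k centroids x d coordinates.
        for j, p in enumerate(points):
            best_i, best_dist = 0, float("inf")
            for i, mu in enumerate(centroids):
                dist = 0.0
                for c in range(d):
                    dist += (p[c] - mu[c]) ** 2
                    ops += 1
                if dist < best_dist:
                    best_i, best_dist = i, dist
            labels[j] = best_i
        # Update step: O(n * d) work to recompute the means.
        sums = [[0.0] * d for _ in range(k)]
        counts = [0] * k
        for p, lab in zip(points, labels):
            counts[lab] += 1
            for c in range(d):
                sums[lab][c] += p[c]
        for i in range(k):
            if counts[i]:  # keep the old centroid if a cluster is empty
                centroids[i] = [s / counts[i] for s in sums[i]]
    return centroids, labels, ops


# Hypothetical toy data: n = 5 points in d = 2 dimensions.
points = [(0.0, 0.0), (0.0, 1.0), (10.0, 0.0), (10.0, 1.0), (5.0, 5.0)]
centroids, labels, ops = lloyd(points, k=2, t=3)
print(ops)  # t * k * n * d = 3 * 2 * 5 * 2 = 60
```

The assignment step dominates each iteration, which is where the t*k*n*d factor comes from. Wrapping this in r random restarts multiplies the cost by r, and the result is still only a local optimum of the objective.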