k-means with selected initial centers


Question

I am trying to run k-means clustering with selected initial centroids. The documentation says the following about specifying your initial centers:

init : {‘k-means++’, ‘random’ or an ndarray} 

If an ndarray is passed, it should be of shape (n_clusters, n_features) and gives the initial centers.

My code in Python:

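import numpy as np
from sklearn.cluster import KMeans

# Hand-picked initial centers, one row per cluster: shape (n_clusters, n_features)
X = np.array([[-19.07480000,  -8.536],
              [22.010800000, -10.9737],
              [12.659700000, 19.2601]], np.float64)
# data is the array of samples being clustered (defined elsewhere)
km = KMeans(n_clusters=3, init=X).fit(data)
centers = km.cluster_centers_
print(centers)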

It returns the error:

RuntimeWarning: Explicit initial center position passed: performing only one init in k-means instead of n_init=10
  n_jobs=self.n_jobs)

and returns the same initial centers. Any idea how to form the initial centers so that they are accepted?

Recommended answer

By default, KMeans runs the algorithm several times from different initial centroids (seeded with k-means++ unless you pass init='random') and keeps the best run. The number of runs is controlled by the n_init= parameter, which the documentation for the version used here describes as:

n_init : int, default: 10

Number of time the k-means algorithm will be run with different centroid seeds. The final results will be the best output of n_init consecutive runs in terms of inertia.
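To make the "best output in terms of inertia" part concrete, here is a minimal sketch that mimics the n_init behavior by hand. It is not scikit-learn's internal implementation; the data array and the seed range are made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data: 200 random 2-D points (not from the question).
rng = np.random.default_rng(0)
data = rng.normal(size=(200, 2))

# Run k-means 10 times, each with a single random initialization.
runs = [
    KMeans(n_clusters=3, init="random", n_init=1, random_state=seed).fit(data)
    for seed in range(10)
]

# Inertia is the sum of squared distances from samples to their closest center.
inertias = [run.inertia_ for run in runs]

# Keep the run with the lowest inertia, which is the selection rule n_init describes.
best = runs[int(np.argmin(inertias))]
print(inertias)
print(best.inertia_)
```

Passing n_init=10 in a single KMeans call does this selection for you; the loop above only spells out the idea.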

If you pass an array as the init= argument, only a single initialization is performed, using the centroids explicitly specified in the array. You are getting a RuntimeWarning because the default value of n_init=10 is still in effect (the check is in the scikit-learn source code).

It is fine to ignore this warning, but you can make it go away completely by passing n_init=1 whenever your init= parameter is an array.
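As a rough illustration of that fix, the sketch below passes the asker's three centers as init= together with n_init=1 and records any warnings raised during fitting. The data array is hypothetical (the question does not show it): three synthetic blobs placed around the chosen centers.

```python
import warnings

import numpy as np
from sklearn.cluster import KMeans

# Initial centers from the question, shape (n_clusters, n_features) = (3, 2).
init_centers = np.array([[-19.0748, -8.536],
                         [22.0108, -10.9737],
                         [12.6597, 19.2601]], dtype=np.float64)

# Hypothetical data: 50 noisy points around each center.
rng = np.random.default_rng(0)
data = np.vstack([c + rng.normal(scale=2.0, size=(50, 2)) for c in init_centers])

# Record warnings so we can check that the "Explicit initial center" one is gone.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    km = KMeans(n_clusters=3, init=init_centers, n_init=1).fit(data)

centers = km.cluster_centers_
print(centers)
print([str(w.message) for w in caught])
```

The fitted centers are refined from the supplied ones, so they will usually differ slightly from init_centers unless the supplied centers already sit at the cluster means.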
