k-means with selected initial centers
Problem description
I am trying to run k-means clustering with selected initial centroids. The documentation says the following about specifying your initial centers:
init : {‘k-means++’, ‘random’ or an ndarray}
If an ndarray is passed, it should be of shape (n_clusters, n_features) and gives the initial centers.
My code in Python:
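import numpy as np
from sklearn.cluster import KMeans

# Initial centers, one row per cluster: shape (n_clusters, n_features)
X = np.array([[-19.07480000, -8.536],
              [22.010800000, -10.9737],
              [12.659700000, 19.2601]], np.float64)
# `data` is the dataset being clustered (not shown in the question)
km = KMeans(n_clusters=3, init=X).fit(data)
# print km
centers = km.cluster_centers_
print(centers)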
Returns an error:
RuntimeWarning: Explicit initial center position passed: performing only one init in k-means instead of n_init=10
n_jobs=self.n_jobs)
and returns the same initial centers. Any idea how to form the initial centers so that they are accepted?
The default behavior of KMeans is to run the algorithm several times, each time starting from a different set of initial centroids. These are chosen with k-means++ by default, or picked at random from the data with init='random' (i.e. the Forgy method). The number of initializations is controlled by the n_init= parameter (docs):
n_init : int, default: 10
Number of times the k-means algorithm will be run with different centroid seeds. The final results will be the best output of n_init consecutive runs in terms of inertia.
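To make "best output in terms of inertia" concrete, here is a rough sketch of the idea, not scikit-learn's internal code: it fits ten single-initialization models by hand and keeps the one with the lowest inertia_. The synthetic `data`, the blob centers, and the seed values are assumptions made up for illustration.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical data: three noisy 2-D blobs (not the asker's dataset).
rng = np.random.default_rng(0)
blob_centers = np.array([[-19.0, -8.5], [22.0, -11.0], [12.7, 19.3]])
data = np.vstack([c + rng.normal(scale=2.0, size=(50, 2)) for c in blob_centers])

# Ten independent runs, each with a single random initialization.
runs = [
    KMeans(n_clusters=3, init="random", n_init=1, random_state=seed).fit(data)
    for seed in range(10)
]
inertias = [run.inertia_ for run in runs]

# Keep the run with the lowest inertia (sum of squared distances
# from each sample to its closest center).
best = min(runs, key=lambda run: run.inertia_)
print("inertia per run:", np.round(inertias, 2))
print("best inertia:", round(best.inertia_, 2))
```

Passing n_init=10 asks KMeans to do this kind of selection for you, which is why it has nothing to choose from once the starting centers are fixed by an array.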
If you pass an array as the init= argument then only a single initialization will be performed using the centroids explicitly specified in the array. You are getting a RuntimeWarning because you are still passing the default value of n_init=10 (here are the relevant lines of source code).
It's actually totally fine to ignore this warning, but you can make it go away completely by passing n_init=1 if your init= parameter is an array.
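A minimal sketch of that fix, assuming a synthetic `data` array in place of the asker's dataset (the blob parameters and the name `init_centers` are made up for illustration). It records warnings during the fit so you can check that no RuntimeWarning is raised:

```python
import warnings

import numpy as np
from sklearn.cluster import KMeans

# Hypothetical stand-in for the asker's `data`: three noisy 2-D blobs.
rng = np.random.default_rng(0)
blob_centers = np.array([[-19.0, -8.5], [22.0, -11.0], [12.7, 19.3]])
data = np.vstack([c + rng.normal(scale=2.0, size=(50, 2)) for c in blob_centers])

# Explicit initial centers, shape (n_clusters, n_features) = (3, 2).
init_centers = np.array([[-19.0748, -8.536],
                         [22.0108, -10.9737],
                         [12.6597, 19.2601]], dtype=np.float64)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    # n_init=1 matches the single initialization an array init implies.
    km = KMeans(n_clusters=3, init=init_centers, n_init=1).fit(data)

runtime_warnings = [w for w in caught if issubclass(w.category, RuntimeWarning)]
print(km.cluster_centers_)
print("RuntimeWarnings raised:", len(runtime_warnings))
```

As far as I know, more recent scikit-learn releases also accept n_init='auto', which performs a single run when init is an array, so the explicit n_init=1 matters mainly on older versions. If the fitted centers come out almost identical to the ones you passed in, that usually just means your initial guess was already close to a converged solution for that data.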