How does sklearn.cluster.KMeans handle an init ndarray parameter with missing centroids (fewer available centroids than n_clusters)?


Question

In Python's sklearn KMeans (see documentation), I was wondering what happens internally when an ndarray of shape (n, n_features) is passed to the init parameter and n < n_clusters:

  1. Does it drop the given centroids and just start a k-means++ initialization, which is the default choice for the init parameter? (PDF paper kmeans++) (How does Kmeans++ work)
  2. Does it keep the given centroids and fill in the remaining centroids using k-means++?
  3. Does it keep the given centroids and fill in the remaining centroids with random values?

I didn't expect this method to return no warning in this case. That's why I need to know how it handles it.

Recommended answer

If you give it a mismatching init, it will adjust the number of clusters, as you can see from the source. This is not documented, and I would consider it a bug. I'll propose a fix.
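Since the behaviour is undocumented and may differ between releases, one way to see what your installed version does is to probe it directly. The snippet below is only a sketch: the toy data and the helper name `probe_mismatched_init` are made up for illustration, and it assumes that a release which validates the init shape raises a `ValueError` (as far as I know, later releases do this instead of silently adjusting the number of clusters).

```python
import numpy as np
import sklearn
from sklearn.cluster import KMeans

# Toy data (illustrative only): 60 points in 2-D around three separated centres.
rng = np.random.RandomState(0)
centres = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
X = np.vstack([c + 0.3 * rng.randn(20, 2) for c in centres])

# Only two initial centroids are supplied although n_clusters is 3.
partial_init = centres[:2]


def probe_mismatched_init(X, init, n_clusters):
    """Fit KMeans with a mismatched init and report what this version does.

    Hypothetical helper, not part of scikit-learn.
    """
    model = KMeans(n_clusters=n_clusters, init=init, n_init=1)
    try:
        model.fit(X)
    except ValueError as exc:
        # Assumption: versions that validate the init shape raise ValueError.
        return "rejected", str(exc)
    # No exception: report how many centroids were actually fitted.
    return "accepted", model.cluster_centers_.shape[0]


outcome, detail = probe_mismatched_init(X, partial_init, n_clusters=3)
print(sklearn.__version__, outcome, detail)
```

If the outcome is "accepted" with fewer centroids than n_clusters, you are seeing the silent adjustment described above. Either way, none of the three options from the question applies, so if you want the missing centroids filled in, you have to build the complete (n_clusters, n_features) array yourself before passing it as init.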
