Sklearn kmeans equivalent of elbow method
Question
Let's say I'm examining up to 10 clusters. With scipy, I usually generate the 'elbow' plot as follows:
from matplotlib import pyplot
from scipy import cluster

# kmeans returns (centroids, distortion) for each number of clusters
cluster_array = [cluster.vq.kmeans(my_matrix, i) for i in range(1,10)]
pyplot.plot([var for (cent,var) in cluster_array])
pyplot.show()
I have since become motivated to use sklearn for clustering; however, I'm not sure how to create the array needed for the plot, as in the scipy case. My best guess was:
from sklearn.cluster import KMeans
km = [KMeans(n_clusters=i) for i range(1,10)]
cluster_array = [km[i].fit(my_matrix)]
Unfortunately, that resulted in an invalid command error. What is the best way to go about this with sklearn?
Thanks
Recommended answer
You had some syntax problems in the code. They should be fixed now:
from sklearn.cluster import KMeans

Ks = range(1, 10)
km = [KMeans(n_clusters=i) for i in Ks]
# fit() returns the estimator itself, so score() can be chained onto it
score = [model.fit(my_matrix).score(my_matrix) for model in km]
The fit method just returns the estimator object itself (self). In this line of the original code:
cluster_array = [km[i].fit(my_matrix)]
cluster_array would end up having the same contents as km.
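The point about fit returning self can be checked directly. The following is a minimal sketch; the toy array X is a hypothetical stand-in for my_matrix, and n_init and random_state are set only to make the run repeatable.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data standing in for my_matrix: two obvious groups.
X = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.1, 5.0]])

model = KMeans(n_clusters=2, n_init=10, random_state=0)
fitted = model.fit(X)

# fit() hands back the very same estimator object, now holding fitted attributes.
same_object = fitted is model
centers = model.cluster_centers_
```

Because fit returns the estimator rather than any measure of fit quality, a list built from fit calls alone holds only model objects, which is why there is nothing numeric to plot.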
You can use the score method to estimate how well the clustering fits. For KMeans, score returns the negative of the K-means objective (the within-cluster sum of squared distances), so values closer to zero indicate a tighter fit. To see the score for each number of clusters, simply run plot(Ks, score).
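For a curve shaped like the scipy elbow plot (positive and decreasing), one option is to read the fitted inertia_ attribute rather than score. The sketch below is illustrative only: elbow_inertias and plot_elbow are hypothetical helper names, and the synthetic blobs stand in for my_matrix.

```python
import numpy as np
from sklearn.cluster import KMeans


def elbow_inertias(X, ks, random_state=0):
    """Fit one KMeans model per k and return the within-cluster sum of squares."""
    inertias = []
    for k in ks:
        model = KMeans(n_clusters=k, n_init=10, random_state=random_state).fit(X)
        # inertia_ is the sum of squared distances of samples to their nearest
        # centroid; score(X) on the training data is the negative of this.
        inertias.append(model.inertia_)
    return inertias


def plot_elbow(ks, inertias):
    """Draw the elbow curve (requires matplotlib)."""
    from matplotlib import pyplot
    pyplot.plot(list(ks), inertias, marker="o")
    pyplot.xlabel("number of clusters k")
    pyplot.ylabel("inertia")
    pyplot.show()


# Hypothetical stand-in for my_matrix: three well-separated 2-D blobs.
rng = np.random.default_rng(0)
my_matrix = np.vstack(
    [rng.normal(loc=c, scale=0.5, size=(50, 2)) for c in (0.0, 5.0, 10.0)]
)

Ks = range(1, 10)
inertias = elbow_inertias(my_matrix, Ks)
```

Calling plot_elbow(Ks, inertias) then draws the curve. With data like this, the bend would be expected near the true number of groups, though on real data the elbow is often less distinct.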