Sklearn kmeans equivalent of elbow method

Question

Let's say I'm examining up to 10 clusters. With scipy I usually generate the 'elbow' plot as follows:

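from scipy import cluster
from matplotlib import pyplot

# Run k-means for k = 1..9; each result is a (centroids, distortion) pair.
cluster_array = [cluster.vq.kmeans(my_matrix, i) for i in range(1,10)]

# Plot the distortion for each k.
pyplot.plot([var for (cent,var) in cluster_array])
pyplot.show()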

I have since become motivated to use sklearn for clustering, but I'm not sure how to create the array needed for the plot as in the scipy case. My best guess was:

from sklearn.cluster import KMeans

km = [KMeans(n_clusters=i) for i range(1,10)]
cluster_array = [km[i].fit(my_matrix)]

That unfortunately resulted in an invalid command error. What is the best sklearn way to go about this?

Thanks

Recommended answer

You had some syntax problems in the code. They should be fixed now:

from sklearn.cluster import KMeans

Ks = range(1, 10)
km = [KMeans(n_clusters=i) for i in Ks]
# Fit each model, then score it on the same data.
score = [model.fit(my_matrix).score(my_matrix) for model in km]

The fit method just returns self, the fitted estimator. In this line in the original code

cluster_array = [km[i].fit(my_matrix)]

the cluster_array would end up with the same contents as km.
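As a minimal sketch of that point: the data below is a hypothetical stand-in for my_matrix, and n_init and random_state are set only to make the run repeatable. It checks that the value returned by fit is the very estimator it was called on.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical stand-in for my_matrix: 20 random 2-D points.
my_matrix = np.random.default_rng(0).normal(size=(20, 2))

km = KMeans(n_clusters=2, n_init=10, random_state=0)
fitted = km.fit(my_matrix)

# fit() returns the estimator itself, so both names refer to the same object.
same_object = fitted is km
print(same_object)

# The fitted results live on attributes such as cluster_centers_ and inertia_.
print(km.cluster_centers_.shape)
```

So collecting the return values of fit gives back the models, not a per-model quality number; that number has to be read from the fitted model.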

You can use the score method to estimate how well the clustering fits. For KMeans, score returns the negative of the k-means objective (the negative inertia), so values closer to zero indicate a tighter fit. To see the score for each number of clusters, simply run pyplot.plot(Ks, score).
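Putting the pieces together, here is one way the full elbow computation might look. It is a sketch, not the answerer's code: the synthetic blobs stand in for my_matrix, plot_elbow is a hypothetical helper, and the fitted attribute inertia_ is used as an alternative to score (it is the same quantity with the sign flipped, which gives the familiar downward-sloping elbow curve).

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical stand-in for my_matrix: three well-separated 2-D blobs.
rng = np.random.default_rng(0)
centers = np.array([[0.0, 0.0], [10.0, 10.0], [-10.0, 10.0]])
my_matrix = np.vstack(
    [c + rng.normal(scale=0.5, size=(50, 2)) for c in centers]
)

Ks = range(1, 10)
# n_init and random_state are fixed only so the sketch is repeatable.
models = [
    KMeans(n_clusters=k, n_init=10, random_state=0).fit(my_matrix) for k in Ks
]

# inertia_ is the within-cluster sum of squared distances (lower is better).
inertias = [m.inertia_ for m in models]
# score() returns the negative of that objective for the data passed in.
scores = [m.score(my_matrix) for m in models]


def plot_elbow(ks, values):
    """Hypothetical helper: plot values against the number of clusters."""
    from matplotlib import pyplot

    pyplot.plot(list(ks), values, marker="o")
    pyplot.xlabel("number of clusters")
    pyplot.ylabel("inertia")
    pyplot.show()
```

Calling plot_elbow(Ks, inertias) draws the curve; for this synthetic data the bend should appear around three clusters, since that is how the blobs were generated.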
