scikit-learn's k-means: what does the predict method really do?

Question

When I use scikit-learn's implementation of k-means, I usually just call the fit() method, which is enough to get the cluster centers and the labels. The predict() method also computes labels, and a fit_predict() method is available for convenience. But if fit() alone already gives me the labels, what is the purpose of the predict() method?

Recommended answer

predict, as @EdChum suggested, can be used on unseen data. This (and even more so the transform method) is useful when k-means is used for feature extraction in semi-supervised learning: you cluster a large set of samples, then use the nearest centroid or the distances to the centroids as features for a subsequent supervised learning problem. When the resulting model is used for prediction, it receives samples that k-means never saw, so their features have to be computed with predict or transform rather than read from the labels stored by fit.
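
As a rough illustration of the point above, here is a minimal sketch. The toy data, the variable names, and the parameter values (n_clusters=2, n_init=10, random_state=0) are assumptions chosen for the example; they do not come from the original question or answer.

```python
import numpy as np
from sklearn.cluster import KMeans

# Toy training data: two well-separated blobs (made up for illustration).
X_train = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                    [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])

kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_train)

# Labels of the training samples are already available after fit().
train_labels = kmeans.labels_

# New samples that k-means never saw during fitting.
X_new = np.array([[0.5, 0.5], [10.5, 10.5]])

# predict() assigns each new sample to its nearest learned centroid.
new_labels = kmeans.predict(X_new)

# transform() returns the distance from each sample to every centroid,
# which can serve as features for a downstream supervised model.
new_distances = kmeans.transform(X_new)  # shape: (n_samples, n_clusters)
```

In this sketch, labels_ only covers the rows of X_train, while predict and transform work on any array with the same number of columns. The cluster indices themselves are arbitrary, so compare labels with each other rather than relying on a particular numbering.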
