scikit-learn's k-means: what does the predict method really do?
Question
When I use scikit-learn's implementation of k-means, I usually just call the fit() method, and that is enough to get the cluster centers and the labels. The predict() method is used to calculate the labels, and a fit_predict() method is even available for convenience. But if I can get the labels using only fit(), what is the purpose of the predict() method?
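For concreteness, the situation described in the question can be sketched as follows. The toy array X and the variable names are made up for illustration, and n_init and random_state are set explicitly only so the two estimators behave identically.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical toy data: two small, well-separated groups of 2-D points.
X = np.array([[1.0, 1.0], [1.2, 0.8], [0.9, 1.1],
              [8.0, 8.0], [8.2, 7.9], [7.8, 8.1]])

# fit() alone already exposes the centers and the training labels.
km = KMeans(n_clusters=2, n_init=10, random_state=0)
km.fit(X)
centers = km.cluster_centers_
labels_from_fit = km.labels_

# fit_predict() is a convenience that fits and returns the same labels.
labels_from_fit_predict = KMeans(
    n_clusters=2, n_init=10, random_state=0
).fit_predict(X)

# predict() on the training data assigns each point to its nearest center.
labels_from_predict = km.predict(X)
```

On the training data all three label arrays agree here, which is why predict() can look redundant at first glance.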
Recommended answer
predict, as @EdChum suggested, can be used on unseen data. This (and, even more so, the transform method) is useful when k-means is used for feature extraction in semi-supervised learning: you cluster a large set of samples, then use the nearest centroid or the distances to the centroids as features for a subsequent supervised learning problem. When you then use the resulting model for prediction, it receives samples that k-means never saw during fitting, so they have to be assigned to clusters with predict (or mapped to centroid distances with transform).
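A minimal sketch of that idea, assuming synthetic data: the arrays X_train and X_new, the blob locations, and the variable names are invented for illustration and are not part of the original answer. It fits k-means once, then uses predict and transform on points that were not in the training set.

```python
import numpy as np
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)

# Hypothetical training set: two well-separated Gaussian blobs.
X_train = np.vstack([
    rng.normal(loc=0.0, scale=0.5, size=(50, 2)),
    rng.normal(loc=10.0, scale=0.5, size=(50, 2)),
])

# Fit once; the cluster centers are learned from the training data only.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_train)

# Unseen samples that were never passed to fit().
X_new = np.array([[0.2, -0.1], [9.8, 10.3]])

# predict(): index of the nearest learned centroid for each new sample.
new_labels = kmeans.predict(X_new)

# transform(): distance from each new sample to every centroid,
# shape (n_samples, n_clusters); usable as features for a later model.
new_distances = kmeans.transform(X_new)
```

The label returned by predict is the column with the smallest distance in the output of transform, so transform carries strictly more information and is often the richer choice of feature for a downstream supervised model.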