Python scikit-learn KMeans is being killed (9) while computing silhouette score

Problem description

I'm currently working on an image dataset (250 000 images, so just as many feature vectors, each of them composed of 132 features) and trying to use the KMeans class provided by sklearn.

I run it on Mac OS X 10.10 with Python 2.7 and sklearn 0.15.2, and after a while I only get a:

Killed: 9

error when running these lines:

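import sklearn.cluster
import sklearn.metrics

# X is the (n_samples, n_features) feature matrix, here 250 000 x 132
nb_cls = int(raw_input("Number of clusters chosen :"))
clusterer = sklearn.cluster.KMeans(n_clusters=nb_cls)
clusters_labels = clusterer.fit_predict(X)
# Per the title, the process is killed during this call
silhouette = sklearn.metrics.silhouette_score(X, clusters_labels)
print "n clusters =", nb_cls, "/ silhouette_score =", silhouette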

Note that the code is not killed when the silhouette score is not computed.

For smaller datasets (about 2 500 images) the same code works and there is no such error.

How can I avoid this Killed: 9 error? Is this calculation too ambitious for my laptop?

Recommended answer

It means your script was killed by the OS. In most cases this happens because it was using too much memory. That seems likely in your case, since your code works fine when you use only 2 500 images.
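A back-of-the-envelope estimate makes the memory explanation plausible. The sketch below assumes that the silhouette computation builds a full n-by-n matrix of pairwise distances stored as 64-bit floats; that is an assumption about how the library works internally, not something verified against sklearn 0.15.2. The helper name pairwise_matrix_bytes is made up for this illustration.

```python
def pairwise_matrix_bytes(n_samples, bytes_per_value=8):
    """Size in bytes of a dense n_samples x n_samples distance matrix.

    Assumes one 8-byte float per entry (hypothetical helper, for illustration).
    """
    return n_samples * n_samples * bytes_per_value


for n in (2500, 250000):
    gigabytes = pairwise_matrix_bytes(n) / 1e9
    print("%d samples -> %.2f GB" % (n, gigabytes))
```

Under that assumption, 2 500 samples need about 0.05 GB while 250 000 samples need about 500 GB, which is far beyond what a laptop has.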

If it is a memory problem, you will have to either get more RAM (perhaps not possible on a Mac?), use another computer with more RAM, or reduce the size of the dataset.
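One way to reduce the size of the data for the silhouette step only is the sample_size parameter of silhouette_score, which computes the score on a random subset of the samples. Here is a minimal sketch, assuming random stand-in data in place of the real image features; the cluster count and sample size are arbitrary choices for illustration, and the syntax is written to run on both Python 2.7 and Python 3.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.metrics import silhouette_score

# Stand-in data (assumption): 3 000 random vectors with 132 features each.
rng = np.random.RandomState(0)
X = rng.rand(3000, 132)

# Cluster the full dataset as in the question.
clusterer = KMeans(n_clusters=5, n_init=10, random_state=0)
labels = clusterer.fit_predict(X)

# Score only a random subset of 1 000 samples, so the distance
# computation covers the subset instead of the whole dataset.
silhouette = silhouette_score(X, labels, sample_size=1000, random_state=0)
print("approximate silhouette_score = %.3f" % silhouette)
```

The result is an estimate of the full silhouette score, so it will vary somewhat with the subset drawn; fixing random_state keeps it reproducible.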
