Sklearn.KMeans: how to avoid Memory or Value Error?
Problem Description
I'm working on an image classification problem and I'm creating a bag-of-words model. To do that, I extracted the SIFT descriptors of all my images, and I have to use the KMeans algorithm to find the centers to use as my bag of words.
Here is my data:
- Number of images: 1584
- Number of SIFT descriptors (vectors of 32 elements): 571685
- Number of centers: 15840
So I ran a KMeans algorithm to compute my centers:
import os
import pickle
import numpy as np
from sklearn.cluster import KMeans

dico = pickle.load(open('./dico.bin', 'rb'))  # np.shape(dico) = (571685, 32)
k = np.size(os.listdir(img_path)) * 10        # = 1584 * 10
kmeans = KMeans(n_clusters=k, n_init=1, verbose=1).fit(dico)
pickle.dump(kmeans, open('./kmeans.bin', 'wb'))
pickle.dump(kmeans.cluster_centers_, open('./dico_reduit.bin', 'wb'))
With this code, I got a MemoryError because I don't have enough memory on my laptop (only 2 GB), so I decided to halve the number of centers and randomly pick half of my SIFT descriptors. This time, I got ValueError: array is too big.
What can I do to get a relevant result without memory problems?
Recommended Answer
As @sascha said in the comments, I just have to use the MiniBatchKMeans class to avoid this problem:
from sklearn.cluster import MiniBatchKMeans  # other imports and k as in the question's snippet

dico = pickle.load(open('./dico.bin', 'rb'))
batch_size = np.size(os.listdir(img_path)) * 3
kmeans = MiniBatchKMeans(n_clusters=k, batch_size=batch_size, verbose=1).fit(dico)
pickle.dump(kmeans, open('./minibatchkmeans.bin', 'wb'))
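If even a single .fit() call on the full descriptor array is too heavy, MiniBatchKMeans also supports incremental training via partial_fit, so only one chunk of descriptors needs to be processed at a time. Below is a minimal sketch of that pattern; the synthetic random data, the small k=50, and the chunk size of 1024 are illustrative assumptions, not values from the question:

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Synthetic stand-in for the real SIFT descriptors (assumption: 32-dim vectors).
rng = np.random.default_rng(0)
descriptors = rng.random((10_000, 32)).astype(np.float32)

k = 50  # far smaller than the 15840 centers in the question, for illustration
mbk = MiniBatchKMeans(n_clusters=k, batch_size=1024, random_state=0)

# Feed the data in chunks; each call refines the centers on one mini-batch,
# so the full array never has to fit in the solver's working memory at once.
for start in range(0, len(descriptors), 1024):
    mbk.partial_fit(descriptors[start:start + 1024])

print(mbk.cluster_centers_.shape)  # (50, 32)
```

The same loop works when the chunks are loaded one at a time from disk (e.g. one pickle file per batch of images) instead of sliced from a single in-memory array.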