How to overcome MemoryError of numpy.unique

Question

I am using NumPy version 1.11.1 and have to deal with a two-dimensional array whose shape is:

my_arr.shape = (25000, 25000)

All values are integers, and I need a list of the array's unique values. When using lst = np.unique(my_arr) I get:

Traceback (most recent call last):
  File "<pyshell#38>", line 1, in <module>
    palette = np.unique(arr)
  File "c:\Python27\lib\site-packages\numpy\lib\arraysetops.py", line 176, in unique
    ar = np.asanyarray(ar).flatten()
MemoryError

My machine has only 8 GB of RAM, but I tried it on another machine with 16 GB of RAM, and the result is the same. Monitoring memory and CPU usage does not suggest that the problem is related to RAM or CPU.

In principle, I know which values the array contains, but what if the input changes? Also, if I want to replace values in the array with others (say, all 2s with 0), will that need a lot of RAM as well?

Recommended answer

32-bit Python can't access more than 4 GiB of RAM (often only ~2.5 GiB). The obvious answer is to use the 64-bit version. If that doesn't work, another solution is to use numpy.memmap and memory-map the array to a file stored on disk.
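For context, a 25000 x 25000 array holds 625 million elements, which is roughly 2.5 GB at 4 bytes per integer or 5 GB at 8 bytes, and the flatten() call in the traceback makes a full copy on top of that. The following is a minimal sketch, not part of the original answer, of how one might check the interpreter's bitness and then work through a memory-mapped array in row blocks. The helper names (python_bits, chunked_unique, replace_in_place), the file name, the int32 dtype, and the small demo shape are all assumptions chosen for illustration.

```python
import os
import struct
import tempfile

import numpy as np


def python_bits():
    """Return 32 or 64 depending on this interpreter's pointer size."""
    return struct.calcsize("P") * 8


def chunked_unique(arr, rows_per_chunk=1000):
    """Collect unique values block by block (hypothetical helper).

    Only one block of rows is processed at a time, so no full-size
    flattened copy of the array is created.
    """
    uniques = np.array([], dtype=arr.dtype)
    for start in range(0, arr.shape[0], rows_per_chunk):
        block = arr[start:start + rows_per_chunk]  # a view, not a copy
        uniques = np.union1d(uniques, np.unique(block))
    return uniques


def replace_in_place(arr, old, new, rows_per_chunk=1000):
    """Replace `old` with `new` block by block (hypothetical helper).

    The boolean mask is only as large as one block of rows.
    """
    for start in range(0, arr.shape[0], rows_per_chunk):
        block = arr[start:start + rows_per_chunk]
        block[block == old] = new  # writes through to `arr`


# Small demo array backed by a file on disk (assumed path and dtype).
path = os.path.join(tempfile.mkdtemp(), "my_arr.dat")
my_arr = np.memmap(path, dtype=np.int32, mode="w+", shape=(400, 400))
my_arr[:] = 1
my_arr[100:120] = 2
my_arr[300:] = 7

print("Interpreter is %d-bit" % python_bits())
print(chunked_unique(my_arr, rows_per_chunk=64).tolist())
replace_in_place(my_arr, old=2, new=0, rows_per_chunk=64)
print(chunked_unique(my_arr, rows_per_chunk=64).tolist())
```

If python_bits() reports 32, switching to a 64-bit interpreter is likely the simpler fix. The block-wise approach also touches the second part of the question: replacing values in place this way needs extra memory only for one block's mask rather than for the whole array.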
