Cupy OutOfMemoryError when trying to cupy.load larger dimension .npy files in memory map mode, but np.load works fine

Problem description

I'm trying to load some larger .npy files in cupy with memory mapped mode, but I keep running into OutOfMemoryError.

I thought that since the file is opened in memory-mapped mode, this operation shouldn't take much memory: a memory map doesn't actually load the whole array into host memory.

I can load these files with np.load just fine; this only seems to happen with cupy.load. My environment is Google Colab with a Tesla K80 GPU. It has about 12 GB of CPU RAM, 12 GB of GPU RAM, and 350 GB of disk space.

Here is a minimal example to reproduce the error:

import os
import numpy as np
import cupy

# Create four ~5 GB .npy files, then remove the raw memmap backing files.
for i in range(4):
    numpyMemmap = np.memmap('reg.memmap' + str(i), dtype='float32', mode='w+', shape=(10000000, 128))
    np.save('reg.memmap' + str(i), numpyMemmap)
    del numpyMemmap
    os.remove('reg.memmap' + str(i))

# Check that they load correctly with np.load.
NPYmemmap = []
for i in range(4):
    NPYmemmap.append(np.load('reg.memmap' + str(i) + '.npy', mmap_mode='r+'))
del NPYmemmap

# Eventually results in a memory error.
CPYmemmap = []
for i in range(4):
    print(i)
    CPYmemmap.append(cupy.load('reg.memmap' + str(i) + '.npy', mmap_mode='r+'))

Output:

0
1
/usr/local/lib/python3.6/dist-packages/cupy/creation/from_data.py:41: UserWarning: Using synchronous transfer as pinned memory (5120000000 bytes) could not be allocated. This generally occurs because of insufficient host memory. The original error was: cudaErrorMemoryAllocation: out of memory
  return core.array(obj, dtype, copy, order, subok, ndmin)
2
3
---------------------------------------------------------------------------
OutOfMemoryError                          Traceback (most recent call last)
<ipython-input-4-b5c849e2adba> in <module>()
      2 for i in range(4):
      3     print(i)
----> 4     CPYmemmap.append( cupy.load( 'reg.memmap'+str(i)+'.npy' , mmap_mode = 'r+' )  )

1 frames
/usr/local/lib/python3.6/dist-packages/cupy/io/npz.py in load(file, mmap_mode)
     47     obj = numpy.load(file, mmap_mode)
     48     if isinstance(obj, numpy.ndarray):
---> 49         return cupy.array(obj)
     50     elif isinstance(obj, numpy.lib.npyio.NpzFile):
     51         return NpzFile(obj)

/usr/local/lib/python3.6/dist-packages/cupy/creation/from_data.py in array(obj, dtype, copy, order, subok, ndmin)
     39 
     40     """
---> 41     return core.array(obj, dtype, copy, order, subok, ndmin)
     42 
     43 

cupy/core/core.pyx in cupy.core.core.array()

cupy/core/core.pyx in cupy.core.core.array()

cupy/core/core.pyx in cupy.core.core.ndarray.__init__()

cupy/cuda/memory.pyx in cupy.cuda.memory.alloc()

cupy/cuda/memory.pyx in cupy.cuda.memory.MemoryPool.malloc()

cupy/cuda/memory.pyx in cupy.cuda.memory.MemoryPool.malloc()

cupy/cuda/memory.pyx in cupy.cuda.memory.SingleDeviceMemoryPool.malloc()

cupy/cuda/memory.pyx in cupy.cuda.memory.SingleDeviceMemoryPool._malloc()

cupy/cuda/memory.pyx in cupy.cuda.memory._try_malloc()

OutOfMemoryError: out of memory to allocate 5120000000 bytes (total 20480000000 bytes)

I am also wondering if this is perhaps related to Google Colab and their environment/GPU.

For convenience, here is a Google Colab notebook of this minimal code:

https://colab.research.google.com/drive/12uPL-ZnKhGTJifZGVdTN7e8qBRRus4t

Answer

The numpy.load mechanism for a disk file, when memory-mapped, may not require the entire file to be loaded from disk into host memory.

However, it appears that cupy.load requires the entire file to fit first in host memory, then in device memory.

Your particular test case creates 4 disk files of ~5 GB each. These won't all fit in either host or device memory if you have 12 GB of each, so I would expect things to fail on the 3rd file load, if not earlier.

It may be possible to use your numpy.load mechanism with mapped memory, and then selectively move portions of that data to the GPU with cupy operations. In that case, the data size on the GPU would still be limited to GPU RAM, for the usual things like cupy arrays.
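A minimal sketch of that chunked approach, assuming the files created by the question's code; the chunk size of 1,000,000 rows (~512 MB per transfer) is an arbitrary illustrative choice:

import numpy as np
import cupy

# Keep the full array memory-mapped on disk; nothing is read yet.
mmap = np.load('reg.memmap0.npy', mmap_mode='r')

chunk_rows = 1000000
for start in range(0, mmap.shape[0], chunk_rows):
    # Copy just this slice from the memmap to the device.
    gpu_chunk = cupy.asarray(mmap[start:start + chunk_rows])
    # ... process gpu_chunk on the GPU here ...
    del gpu_chunk  # release device memory before the next chunk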

Even if you could use CUDA pinned "zero-copy" memory, it would still be limited to the host memory size (12 GB, here) or less.
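For reference, a sketch of allocating pinned (page-locked) host memory with CuPy; the pinned_copy helper is hypothetical, and the key point is that the pinned buffer is still ordinary host RAM, so it is subject to the same ~12 GB cap:

import numpy as np
import cupy

def pinned_copy(arr):
    # Allocate pinned host memory the same size as arr and copy arr into it.
    # Pinned memory speeds up host<->device transfers, but it still lives
    # in host RAM, so total allocations remain bounded by CPU memory.
    mem = cupy.cuda.alloc_pinned_memory(arr.nbytes)
    pinned = np.frombuffer(mem, arr.dtype, arr.size).reshape(arr.shape)
    pinned[...] = arr
    return pinned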
