Resizing numpy.memmap arrays

Question

I'm working with a bunch of large numpy arrays, and since these have started to chew up too much memory lately, I want to replace them with numpy.memmap instances. The problem is that now and then I have to resize the arrays, and I'd prefer to do that in place. This works quite well with ordinary arrays, but trying it on a memmap fails with a complaint that the data might be shared, and even disabling the refcheck does not help.

>>> import numpy as np
>>> a = np.arange(10)
>>> a.resize(20)
>>> a
array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

>>> a = np.memmap('bla.bin', dtype=int)
>>> a
memmap([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])

>>> a.resize(20, refcheck=False)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-41-f1546111a7a1> in <module>()
----> 1 a.resize(20, refcheck=False)

ValueError: cannot resize this array: it does not own its data

Resizing the underlying mmap buffer works perfectly fine; the problem is how to reflect the change in the array object. I've seen this workaround, but unfortunately it doesn't resize the array in place. There is also some numpy documentation about resizing mmaps, but it clearly doesn't work, at least with version 1.8.0. Are there any other ways to override the built-in resizing checks?

Recommended answer

The issue is that the flag OWNDATA is False when you create your array. You can change that by requiring the flag to be True when you create the array:

>>> a = np.require(np.memmap('bla.bin', dtype=int), requirements=['O'])
>>> a.shape
(10,)
>>> a.flags
  C_CONTIGUOUS : True
  F_CONTIGUOUS : True
  OWNDATA : True
  WRITEABLE : True
  ALIGNED : True
  UPDATEIFCOPY : False
>>> a.resize(20, refcheck=False)
>>> a.shape
(20,)

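The only caveat is that np.require makes a copy whenever a requirement is not already met. Since a memmap does not own its data, the resizable array you get back here is an in-memory copy, so resizing it does not change the file on disk.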
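
To see that caveat in practice, here is a minimal sketch. It assumes a scratch file created with tempfile; the path and the variable names are illustrative and not part of the original post. It checks that the array returned by np.require owns its data, shares no memory with the mapping, and that resizing or writing to it leaves the file untouched.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch file, created only for this demonstration.
path = os.path.join(tempfile.mkdtemp(), "bla.bin")
np.zeros(10, dtype=np.int64).tofile(path)

m = np.memmap(path, dtype=np.int64, mode="r+")
a = np.require(m, requirements=["O"])

owns_data = bool(a.flags["OWNDATA"])      # requirement 'O' is satisfied
shares_memory = np.shares_memory(a, m)    # whether `a` still aliases the mapping

a.resize(20, refcheck=False)              # succeeds, but only on the copy
a[3] = 7

file_size = os.path.getsize(path)         # size of the file behind `m`
on_disk = np.fromfile(path, dtype=np.int64)

print(owns_data, shares_memory, a.shape, file_size, on_disk[3])
```

If the goal is to keep the data on disk, the copy defeats the purpose of the memmap, so this route mainly helps when the array fits in memory anyway.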

Edit to address saving:

If you want to save the resized array to disk, you can save it as a .npy file and open that file as a numpy.memmap whenever you need to use it as a memmap again:

>>> a[9] = 1
>>> np.save('bla.npy', a)
>>> # in 'r+' mode the dtype and shape are read from the .npy header
>>> b = np.lib.format.open_memmap('bla.npy', mode='r+')
>>> b
memmap([0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
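
As a self-contained sketch of that round trip (the scratch path and the values written are assumptions made for illustration), the following saves an array as .npy, reopens it with np.lib.format.open_memmap, writes through the mapping, and reloads the file to confirm the change reached disk.

```python
import os
import tempfile

import numpy as np

# Hypothetical scratch location; the answer above uses 'bla.npy' in the working directory.
npy_path = os.path.join(tempfile.mkdtemp(), "bla.npy")

a = np.zeros(20, dtype=np.int64)
a[9] = 1
np.save(npy_path, a)

# In 'r+' mode, dtype and shape come from the .npy header.
b = np.lib.format.open_memmap(npy_path, mode="r+")
b[0] = 5      # write through the mapping
b.flush()     # push the change to disk
del b         # drop the mapping before reading the file again

reloaded = np.load(npy_path)
print(reloaded.shape, reloaded.dtype, reloaded[0], reloaded[9])
```

Because the header records shape and dtype, the reopened memmap does not depend on the caller remembering either of them.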

Edit to offer another method:

You may get close to what you're looking for by resizing the underlying mmap (a.base or a._mmap, which is addressed in bytes, i.e. uint8) and then "reloading" the memmap:

>>> a = np.memmap('bla.bin', dtype=int)
>>> a
memmap([0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
>>> a[3] = 7
>>> a
memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0])
>>> a.flush()
>>> a = np.memmap('bla.bin', dtype=int)
>>> a
memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0])
>>> a.base.resize(20 * a.itemsize)  # new size in bytes: 20 elements of a.itemsize bytes each
>>> a.flush()
>>> a = np.memmap('bla.bin', dtype=int)
>>> a
memmap([0, 0, 0, 7, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
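
The same "resize, then reload" idea can also be sketched without calling resize on the mmap object, which may not be available on every platform. The helper below, reopen_resized, is a hypothetical name introduced for illustration and not a NumPy function. It assumes a 1-D array, and that the caller has flushed and released the old memmap first, since some platforms may refuse to change a file that is still mapped.

```python
import os
import tempfile

import numpy as np


def reopen_resized(path, dtype, new_len):
    """Hypothetical helper: set the file to new_len elements and map it again."""
    dtype = np.dtype(dtype)
    with open(path, "r+b") as f:
        # Grow or shrink the file; on most systems new bytes are zero-filled.
        f.truncate(new_len * dtype.itemsize)
    return np.memmap(path, dtype=dtype, mode="r+", shape=(new_len,))


# Hypothetical scratch file for the demonstration.
path = os.path.join(tempfile.mkdtemp(), "bla.bin")

a = np.memmap(path, dtype=np.int64, mode="w+", shape=(10,))
a[3] = 7
a.flush()   # make sure pending writes reach the file
del a       # release the old mapping before touching the file size

a = reopen_resized(path, np.int64, 20)
print(a.shape, a[3], os.path.getsize(path))
```

Like the session above, this does not resize the array object in place: every other reference to the old memmap has to be replaced with the returned one.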