How does numpy handle mmap over npz files?
Question
I have a case where I would like to open a compressed numpy file using mmap mode, but can't seem to find any documentation about how it will work under the covers. For example, will it decompress the archive in memory and then mmap it? Will it decompress on the fly?
The documentation for this configuration is lacking.
Recommended answer
The short answer, based on looking at the code, is that archiving and compression, whether using np.savez or gzip, are not compatible with accessing files in mmap_mode. It's not just a matter of how it is done, but whether it can be done at all.
The relevant bits in the np.load function:
    elif isinstance(file, gzip.GzipFile):
        fid = seek_gzip_factory(file)
    ...
    if magic.startswith(_ZIP_PREFIX):
        # zip-file (assume .npz)
        # Transfer file ownership to NpzFile
        tmp = own_fid
        own_fid = False
        return NpzFile(fid, own_fid=tmp)
    ...
    if mmap_mode:
        return format.open_memmap(file, mode=mmap_mode)
Look at np.lib.npyio.NpzFile. An npz file is a ZIP archive of .npy files. np.load returns a dictionary-like object, and the individual variables (arrays) are only loaded when you access them (e.g. obj[key]). There's no provision in its code for opening those individual files in mmap_mode.
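To make the lazy, non-mmap behaviour concrete, here is a minimal sketch. The helper name load_npz_member and the file name demo.npz are made up for illustration; the only NumPy calls used are np.savez_compressed and np.load. On the NumPy versions I'm aware of, mmap_mode has no effect for an npz archive, so the result should be an ordinary in-memory ndarray rather than an np.memmap.

```python
import os
import tempfile

import numpy as np


def load_npz_member(path, key, mmap_mode="r"):
    """Load one array from an .npz archive, passing mmap_mode to np.load.

    Hypothetical helper for illustration only.
    """
    with np.load(path, mmap_mode=mmap_mode) as archive:
        # `archive` is an NpzFile; indexing reads (and, if needed,
        # decompresses) that one member into memory.
        return archive[key]


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.npz")
    np.savez_compressed(path, a=np.arange(10))

    arr = load_npz_member(path, "a")
    # Check whether the requested mmap_mode produced a memory map.
    is_memmap = isinstance(arr, np.memmap)
    print(type(arr).__name__, is_memmap)
```

If the check reports that the array is not a memmap, that matches the reading of the code above: the member was read into memory when it was indexed.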
So a file created with np.savez cannot be accessed as an mmap. The ZIP archiving and compression is not the same as the gzip compression handled earlier in np.load.
But what of a single array saved with np.save and then gzipped? Note that format.open_memmap is called with file, not fid (which might be a gzip file).
More details are in open_memmap in np.lib.npyio.format. Its first test is that file must be a string, not an already-open file object (fid). It ends up delegating the work to np.memmap. I don't see any provision in that function for gzip.
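A sketch of what this means in practice: an uncompressed .npy file on disk can be opened with mmap_mode, while a gzipped copy has to be decompressed to a real file first. The helper gunzip_to and all file names below are assumptions for illustration, not part of NumPy. Decompressing to disk is one possible workaround, not something np.load does for you.

```python
import gzip
import os
import shutil
import tempfile

import numpy as np


def gunzip_to(src_gz, dst):
    """Decompress src_gz to dst on disk so the result can be memory-mapped.

    Hypothetical helper for illustration only.
    """
    with gzip.open(src_gz, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dst


# Temporary directory; left in place here because the memmaps below
# keep their files open.
tmp = tempfile.mkdtemp()
npy_path = os.path.join(tmp, "data.npy")
gz_path = npy_path + ".gz"
restored_path = os.path.join(tmp, "restored.npy")

data = np.arange(12, dtype=np.int64).reshape(3, 4)
np.save(npy_path, data)

# Make a gzipped copy of the .npy file.
with open(npy_path, "rb") as f_in, gzip.open(gz_path, "wb") as f_out:
    shutil.copyfileobj(f_in, f_out)

# A plain .npy path can be memory-mapped directly.
direct = np.load(npy_path, mmap_mode="r")

# The gzipped copy is first written back out as a plain .npy file.
restored = np.load(gunzip_to(gz_path, restored_path), mmap_mode="r")

print(type(direct).__name__, type(restored).__name__)
```

The cost of this approach is the extra disk space for the decompressed copy. The memory map itself only ever sees an uncompressed file.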