How to put many numpy files in one big numpy file without having a memory error?

Question

I followed the question "Append multiple numpy files to one big numpy file in python" in order to put many numpy files into one big file. The result is:

import glob
import os

import numpy as np

fpath = "path_Of_my_final_Big_File"
npyfilespath = "path_of_my_numpy_files"

# Work from inside the input directory and list the files in a stable order.
os.chdir(npyfilespath)
npfiles = sorted(glob.glob("*.npy"))

# Allocating the full result array in RAM is where the memory error comes from.
all_arrays = np.zeros((166601, 8000))
for i, npfile in enumerate(npfiles):
    # The names from glob are already relative to the current directory.
    all_arrays[i] = np.load(npfile)

np.save(fpath, all_arrays)
data = np.load(fpath)
print(data)
print(data.shape)

I have thousands of files, and this code always raises a memory error, so I never get the result file. How can I resolve this error? How can I read, write, and append to the final numpy file one file at a time?

Recommended answer

Try having a look at numpy.memmap.

From the documentation:

Memory-mapped files are used for accessing small segments of large files on disk, without reading the entire file into memory.

You will be able to access the whole array, but the operating system will take care of loading only the parts that you actually need. Read the documentation page carefully, and note that, for performance, you can decide whether the file is stored row-wise or column-wise. Pick the layout that matches how you will read the data most often.
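As a rough illustration of this idea, here is a minimal sketch, not the answerer's own code. It assumes every input file holds a 1-D array of the same length (one row of the final array, as in the question). The function name merge_npy_files and its parameters are hypothetical. It uses np.lib.format.open_memmap, which creates a regular .npy file backed by a memory map, so the rows are written to disk one file at a time instead of being collected in RAM first.

```python
import glob
import os
import tempfile

import numpy as np


def merge_npy_files(src_dir, out_path, row_len, dtype=np.float64):
    """Copy each 1-D .npy file in src_dir into one row of a disk-backed .npy file.

    Hypothetical helper: assumes every input array has exactly row_len elements.
    """
    files = sorted(glob.glob(os.path.join(src_dir, "*.npy")))
    # open_memmap writes a valid .npy header and maps the data region to disk,
    # so the full (len(files), row_len) array is never allocated in RAM.
    out = np.lib.format.open_memmap(
        out_path, mode="w+", dtype=dtype, shape=(len(files), row_len)
    )
    for i, name in enumerate(files):
        out[i] = np.load(name)  # only one small input array is loaded at a time
    out.flush()  # write pending changes to disk
    del out      # release the memory map
    return len(files)


if __name__ == "__main__":
    # Tiny demo with made-up data in temporary directories.
    src = tempfile.mkdtemp()
    dst = tempfile.mkdtemp()
    for k in range(3):
        np.save(os.path.join(src, "part_%03d.npy" % k), np.full(4, float(k)))
    target = os.path.join(dst, "merged.npy")
    merge_npy_files(src, target, row_len=4)
    # mmap_mode="r" reads the result lazily instead of loading it all at once.
    merged = np.load(target, mmap_mode="r")
    print(merged.shape)
```

Keeping the output file outside the input directory avoids picking it up as an input on a later run. With the default row-major layout, writing and reading whole rows touches contiguous regions of the file.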
