Input and output numpy arrays to h5py

Problem description

I have Python code whose output is a matrix with entries that are all of type float. If I save it with the .dat extension, the file size is on the order of 500 MB. I read that using h5py reduces the file size considerably. So, say I have a 2D numpy array named A. How do I save it to an h5py file? Also, how do I read the same file back into a numpy array in a different script, since I need to manipulate the array?

Solution

h5py provides a model of datasets and groups. Datasets are essentially arrays, and groups can be thought of as directories. Each one is named. You should look at the documentation for the API and examples:

http://docs.h5py.org/en/latest/quick.html
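
To make the dataset/group model concrete, here is a minimal sketch. The file name, group name, and dataset name are hypothetical and chosen only for illustration; the file is written to a temporary directory so nothing is left behind in the working directory.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical file and object names, used only for this illustration.
path = os.path.join(tempfile.mkdtemp(), "layout.h5")

with h5py.File(path, "w") as f:
    # A group behaves like a directory inside the file.
    grp = f.create_group("experiment")
    # A dataset behaves like an array stored under a name.
    grp.create_dataset("values", data=np.arange(6.0).reshape(2, 3))

with h5py.File(path, "r") as f:
    names = list(f.keys())                   # names at the top level of the file
    shape = f["experiment/values"].shape     # path-style access through the group
    values = f["experiment/values"][:]       # read the dataset into a numpy array
```

Using the file as a context manager closes it automatically, even if an exception is raised partway through.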

A simple example where you are creating all of the data upfront and just want to save it to an hdf5 file would look something like:

In [1]: import numpy as np
In [2]: import h5py
In [3]: a = np.random.random(size=(100,20))
In [4]: h5f = h5py.File('data.h5', 'w')
In [5]: h5f.create_dataset('dataset_1', data=a)
Out[5]: <HDF5 dataset "dataset_1": shape (100, 20), type "<f8">

In [6]: h5f.close()

You can then load that data back in using:

In [10]: h5f = h5py.File('data.h5','r')
In [11]: b = h5f['dataset_1'][:]
In [12]: h5f.close()

In [13]: np.allclose(a,b)
Out[13]: True
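
For use in scripts rather than an interactive session, the same round trip could be wrapped in two small helpers. This is only a sketch: `save_array` and `load_array` are hypothetical names, not part of h5py. The optional `compression="gzip"` argument to `create_dataset` is an h5py feature; whether it actually makes the file smaller depends on the data (random floats, as used here, compress poorly).

```python
import os
import tempfile

import h5py
import numpy as np


def save_array(path, name, array, compress=False):
    """Write `array` as dataset `name` in a new HDF5 file at `path`."""
    # gzip compression is optional; its benefit depends on the data.
    kwargs = {"compression": "gzip"} if compress else {}
    with h5py.File(path, "w") as f:
        f.create_dataset(name, data=array, **kwargs)


def load_array(path, name):
    """Read dataset `name` from the HDF5 file at `path` into a numpy array."""
    with h5py.File(path, "r") as f:
        return f[name][:]  # slicing copies the data out before the file closes


a = np.random.random(size=(100, 20))
path = os.path.join(tempfile.mkdtemp(), "data.h5")

save_array(path, "dataset_1", a, compress=True)
b = load_array(path, "dataset_1")
```

The second script only needs `load_array` (or the two lines inside it) plus the file path and dataset name to get the array back.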

Definitely check out the docs:

http://docs.h5py.org

Writing to an HDF5 file from Python relies on either h5py or PyTables (each provides a different Python API that sits on top of the HDF5 file specification). You should also take a look at the other simple binary formats that numpy provides natively, such as np.save, np.savez, etc.:

http://docs.scipy.org/doc/numpy/reference/routines.io.html
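
As a rough sketch of the numpy-native alternative, `np.save` stores a single array in a `.npy` file and `np.savez` stores several named arrays in one `.npz` archive. The file names and the `doubled` key are hypothetical, chosen only for the example.

```python
import os
import tempfile

import numpy as np

a = np.random.random(size=(100, 20))
tmp = tempfile.mkdtemp()

# One array per .npy file.
npy_path = os.path.join(tmp, "a.npy")
np.save(npy_path, a)
b = np.load(npy_path)

# Several named arrays in a single .npz archive.
npz_path = os.path.join(tmp, "arrays.npz")
np.savez(npz_path, a=a, doubled=2 * a)
with np.load(npz_path) as data:
    c = data["a"]
    d = data["doubled"]
```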
