Saving with h5py arrays of different sizes

Problem description

I am trying to store about 3000 numpy arrays using the HDF5 data format. The arrays vary in length from 5306 to 121999 np.float64 values.

I am getting an Object dtype dtype('O') has no native HDF5 equivalent error because, owing to the irregular shape of the data, numpy falls back to the generic object dtype.

My idea was to pad all the arrays to length 121999 and store the original sizes in another dataset.

However, this seems quite space-inefficient. Is there a better way?
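For reference, the padding idea could look roughly like the sketch below. The dataset names `padded` and `lengths`, the NaN fill value, and the temporary file name are illustrative assumptions, not part of the original post; three small arrays stand in for the real data.

```python
import os
import tempfile

import h5py
import numpy as np

# Small stand-ins for the ~3000 real arrays.
arrays = [
    np.array([0.1, 0.2, 0.3], dtype=np.float64),
    np.array([0.1, 0.2, 0.3, 0.4, 0.5], dtype=np.float64),
    np.array([0.1, 0.2], dtype=np.float64),
]

lengths = np.array([len(x) for x in arrays], dtype=np.int64)
max_len = int(lengths.max())

# Rectangular array padded with NaN (an arbitrary fill value).
padded = np.full((len(arrays), max_len), np.nan, dtype=np.float64)
for row, x in enumerate(arrays):
    padded[row, : len(x)] = x

path = os.path.join(tempfile.mkdtemp(), "padded.hdf5")  # hypothetical file name
with h5py.File(path, "w") as f:
    f.create_dataset("padded", data=padded)
    f.create_dataset("lengths", data=lengths)

# Reading back: trim each row to its recorded length.
with h5py.File(path, "r") as f:
    restored = [row[:n] for row, n in zip(f["padded"][:], f["lengths"][:])]

wasted = padded.size - int(lengths.sum())  # number of padding cells
```

In this toy case 5 of the 15 stored cells are padding. With lengths ranging from 5306 to 121999 the padded share could be far larger, which is the space concern raised above.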

To clarify, I want to store 3126 arrays of dtype = np.float64. I have them stored in a list, and when h5py processes it, the list is converted to an array of dtype = object because the arrays have different lengths. To illustrate:

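import numpy as np

a = np.array([0.1,0.2,0.3],dtype=np.float64)
b = np.array([0.1,0.2,0.3,0.4,0.5],dtype=np.float64)
c = np.array([0.1,0.2],dtype=np.float64)

# Roughly what happens inside the h5py call. Recent NumPy versions need an
# explicit dtype=object for ragged input; older versions inferred it.
arrs = np.array([a,b,c],dtype=object)
print(arrs.dtype)
>>> object
print(arrs[0].dtype)
>>> float64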

Recommended answer

It looks like you tried something like:

In [364]: f=h5py.File('test.hdf5','w')    
In [365]: grp=f.create_group('alist')

In [366]: grp.create_dataset('alist',data=[a,b,c])
...
TypeError: Object dtype dtype('O') has no native HDF5 equivalent

But if you instead save the arrays as separate datasets, it works:

In [367]: adict=dict(a=a,b=b,c=c)

In [368]: for k,v in adict.items():
   .....:     grp.create_dataset(k,data=v)
   .....:

In [369]: grp
Out[369]: <HDF5 group "/alist" (3 members)>

In [370]: grp['a'][:]
Out[370]: array([ 0.1,  0.2,  0.3])
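With the arrays held in a list rather than a dict, the same one-dataset-per-array approach might be sketched as follows. The `arr_0000` naming scheme, the file name, and the random stand-in data are assumptions made for illustration only.

```python
import os
import tempfile

import h5py
import numpy as np

rng = np.random.default_rng(0)
# Stand-in data: 10 float64 arrays of differing lengths.
arrays = [rng.random(int(n)) for n in rng.integers(5, 50, size=10)]

path = os.path.join(tempfile.mkdtemp(), "ragged.hdf5")  # hypothetical file name

with h5py.File(path, "w") as f:
    grp = f.create_group("alist")
    for i, x in enumerate(arrays):
        # Zero-padded names (hypothetical scheme) so that sorting the
        # names reproduces the original list order.
        grp.create_dataset(f"arr_{i:04d}", data=x)

with h5py.File(path, "r") as f:
    grp = f["alist"]
    third = grp["arr_0002"][:]  # read a single array by name
    loaded = [grp[name][:] for name in sorted(grp.keys())]
```

Each dataset keeps its own length, so no padding or separate size table is needed.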

To access all the datasets in the group:

In [389]: [i[:] for i in grp.values()]
Out[389]: 
[array([ 0.1,  0.2,  0.3]),
 array([ 0.1,  0.2,  0.3,  0.4,  0.5]),
 array([ 0.1,  0.2])]
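As a possible alternative that the answer above does not cover, h5py also offers variable-length element types, which allow ragged data to live in a single dataset. The following is a minimal sketch, assuming h5py 2.10 or later (where `h5py.vlen_dtype` is available); the dataset name `ragged` and the file name are hypothetical.

```python
import os
import tempfile

import h5py
import numpy as np

arrays = [
    np.array([0.1, 0.2, 0.3], dtype=np.float64),
    np.array([0.1, 0.2, 0.3, 0.4, 0.5], dtype=np.float64),
    np.array([0.1, 0.2], dtype=np.float64),
]

path = os.path.join(tempfile.mkdtemp(), "vlen.hdf5")  # hypothetical file name

# Variable-length float64 element type.
dt = h5py.vlen_dtype(np.dtype("float64"))

with h5py.File(path, "w") as f:
    dset = f.create_dataset("ragged", (len(arrays),), dtype=dt)
    for i, x in enumerate(arrays):
        dset[i] = x  # each element holds one array of any length

with h5py.File(path, "r") as f:
    dset = f["ragged"]
    loaded = [dset[i] for i in range(len(arrays))]
```

Whether this performs better than separate datasets likely depends on the access pattern and data size, so it may be worth measuring both on the real data.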
