Resizing and storing a dataset in .h5 format using h5py in Python
Question
I am trying to resize a dataset and store new values using the `h5py` package in Python. My dataset grows at every time step, and I would like to append to the `.h5` file using the `resize` function. However, I run into errors with my approach. The variable `dset` is a list of datasets.
import os
import h5py
import numpy as np

path = './out.h5'
os.remove(path)

def create_h5py(path):
    with h5py.File(path, "a") as hf:
        grp = hf.create_group('left')
        dset = []
        dset.append(grp.create_dataset('voltage', (10**4,3), maxshape=(None,3), dtype='f', chunks=(10**4,3)))
        dset.append(grp.create_dataset('current', (10**4,3), maxshape=(None,3), dtype='f', chunks=(10**4,3)))
    return dset

if __name__ == '__main__':
    dset = create_h5py(path)
    for i in range(3):
        if i == 0:
            dset[0][:] = np.random.random(dset[0].shape)
            dset[1][:] = np.random.random(dset[1].shape)
        else:
            dset[0].resize(dset[0].shape[0]+10**4, axis=0)
            dset[0][-10**4:] = np.random.random((10**4,3))
            dset[1].resize(dset[1].shape[0]+10**4, axis=0)
            dset[1][-10**4:] = np.random.random((10**4,3))
Edit
Thanks to tel I was able to solve this. Replace `with h5py.File(path, "a") as hf:` with `hf = h5py.File(path, "a")`.
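The likely reason this change helps is that leaving a `with h5py.File(...)` block closes the file, and dataset objects belonging to a closed file can no longer be read or written. The following is a minimal sketch of that behaviour, not the original poster's code: the scratch file name `handle_demo.h5` and the helper `create_inside_with` are made up for illustration. It also shows that when the file is opened without `with`, closing it becomes the caller's job.

```python
import os
import tempfile

import h5py
import numpy as np

# Hypothetical scratch file so the sketch does not touch ./out.h5.
demo_path = os.path.join(tempfile.mkdtemp(), "handle_demo.h5")


def create_inside_with(path):
    """Mirror the question's pattern: the file is closed before we return."""
    with h5py.File(path, "a") as hf:
        dset = hf.create_dataset("voltage", (4, 3), maxshape=(None, 3), dtype="f")
    return dset  # hf has already been closed at this point


stale = create_inside_with(demo_path)
stale_is_usable = bool(stale)  # False: the owning file is closed

# Variant from the edit: open the file without `with`, then close it explicitly.
hf = h5py.File(demo_path, "a")
try:
    live = hf["voltage"]
    live_is_usable = bool(live)  # True while hf is still open
    live[:] = np.ones(live.shape, dtype="f")
finally:
    hf.close()

print(stale_is_usable, live_is_usable)
```

`bool()` on an h5py dataset reports whether it is still accessible, which makes it a convenient check before writing.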
Recommended answer
@tel provided an elegant solution to the problem. I outlined a simpler approach in my comments below his answer. It is simpler for a beginner to code (and understand). Basically, it makes a few minor changes to @Maxtron's original code. The modifications are:
- Move `with h5py.File(path, "a") as hf:` to the `__main__` routine.
- Pass `hf` to `create_h5py(hf)`.
- I also added a test before `os.remove()` to avoid errors if the h5 file doesn't exist.
My suggested modifications are below:
import os
import h5py
import numpy as np

path = './out.h5'
# test existence of H5 file before deleting
if os.path.isfile(path):
    os.remove(path)

def create_h5py(hf):
    grp = hf.create_group('left')
    dset = []
    dset.append(grp.create_dataset('voltage', (10**4,3), maxshape=(None,3), dtype='f', chunks=(10**4,3)))
    dset.append(grp.create_dataset('current', (10**4,3), maxshape=(None,3), dtype='f', chunks=(10**4,3)))
    return dset

if __name__ == '__main__':
    with h5py.File(path, "a") as hf:
        dset = create_h5py(hf)
        for i in range(3):
            if i == 0:
                dset[0][:] = np.random.random(dset[0].shape)
                dset[1][:] = np.random.random(dset[1].shape)
            else:
                dset[0].resize(dset[0].shape[0]+10**4, axis=0)
                dset[0][-10**4:] = np.random.random((10**4,3))
                dset[1].resize(dset[1].shape[0]+10**4, axis=0)
                dset[1][-10**4:] = np.random.random((10**4,3))
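The resize-then-write steps can also be wrapped in a small helper. The sketch below is one possible variant and not part of the original answer: `append_rows`, the scratch file `append_demo.h5`, and the block size of 100 rows are assumptions chosen to keep the demo quick. It starts from an empty dataset with shape `(0, 3)`, so the special case for `i == 0` is not needed, and it reopens the file read-only to confirm that the appended rows were stored.

```python
import os
import tempfile

import h5py
import numpy as np


def append_rows(dset, rows):
    """Grow dset along axis 0 and write rows into the new tail (hypothetical helper)."""
    rows = np.asarray(rows)
    old_len = dset.shape[0]
    dset.resize(old_len + rows.shape[0], axis=0)
    dset[old_len:] = rows
    return dset.shape[0]


demo_path = os.path.join(tempfile.mkdtemp(), "append_demo.h5")
block = 100  # assumed block size, smaller than the 10**4 rows used above

with h5py.File(demo_path, "w") as hf:
    # Intermediate group 'left' is created automatically from the path.
    dset = hf.create_dataset("left/voltage", (0, 3), maxshape=(None, 3),
                             dtype="f", chunks=(block, 3))
    for i in range(3):
        # Fill each block with its index so the blocks are easy to tell apart.
        append_rows(dset, np.full((block, 3), i, dtype="f"))

# Reopen read-only to confirm that the data reached the file.
with h5py.File(demo_path, "r") as hf:
    final_shape = hf["left/voltage"].shape
    first_values = hf["left/voltage"][::block, 0].tolist()

print(final_shape, first_values)
```

Every write happens while the file is still open, which is the same constraint the modified code above respects.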