Resizing and storing a dataset in .h5 format using h5py in Python


Problem description

I am trying to resize a dataset and store new values using the h5py package in Python. My dataset grows at every time step, and I would like to append to the .h5 file using the resize function. However, my approach raises errors. The variable dset is a list of datasets.

import os
import h5py
import numpy as np

path = './out.h5'
os.remove(path)

def create_h5py(path):
    with h5py.File(path, "a") as hf:
        grp = hf.create_group('left')
        dset = []
        dset.append(grp.create_dataset('voltage', (10**4,3), maxshape=(None,3), dtype='f', chunks=(10**4,3)))
        dset.append(grp.create_dataset('current', (10**4,3), maxshape=(None,3), dtype='f', chunks=(10**4,3)))
        return dset

if __name__ == '__main__':
    dset = create_h5py(path)
    for i in range(3):

        if i == 0:
            dset[0][:] = np.random.random(dset[0].shape) 
            dset[1][:] = np.random.random(dset[1].shape)
        else:
            dset[0].resize(dset[0].shape[0]+10**4, axis=0)
            dset[0][-10**4:] = np.random.random((10**4,3))
            dset[1].resize(dset[1].shape[0]+10**4, axis=0)
            dset[1][-10**4:] = np.random.random((10**4,3))

Edit

Thanks to tel, I was able to solve this. In create_h5py, replace the statement 'with h5py.File(path, "a") as hf:' with the plain assignment 'hf = h5py.File(path, "a")'. The with block closes the file as soon as the function returns, which leaves the returned dataset handles unusable.
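To illustrate why the original version fails, here is a minimal sketch of what happens to a dataset handle once its with block exits. The function name handle_validity_demo and the temporary file path are made up for this illustration; the check relies on h5py objects evaluating as False once their underlying identifier has been closed.

```python
import os
import tempfile

import h5py


def handle_validity_demo(path):
    """Return whether a dataset handle is usable inside and after the with block."""
    with h5py.File(path, "w") as hf:
        dset = hf.create_dataset("voltage", (4, 3), maxshape=(None, 3), dtype="f")
        valid_inside = bool(dset)   # file still open: the handle is live
    valid_outside = bool(dset)      # file closed on exit: the handle is dead
    return valid_inside, valid_outside


# Hypothetical scratch location, used only for this demonstration
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
inside, outside = handle_validity_demo(demo_path)
print(inside, outside)
```

Any resize or write through the dead handle fails, which is the situation the question's code ends up in after create_h5py returns.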

Recommended answer

@tel provided an elegant solution to the problem. I outlined a simpler approach in my comments below his answer. It is simpler for a beginner to code (and understand). Basically, there are a few minor changes to @Maxtron's original code. The modifications are:

  • move 'with h5py.File(path, "a") as hf:' to the __main__ routine
  • pass hf to create_h5py(hf)
  • I also added a test before os.remove() to avoid an error if the h5 file doesn't exist

My suggested modifications are below:

import os

import h5py
import numpy as np

path = './out.h5'
# Test for the H5 file before deleting it, so a missing file does not raise
if os.path.isfile(path):
    os.remove(path)

def create_h5py(hf):
    grp = hf.create_group('left')
    dset = []
    dset.append(grp.create_dataset('voltage', (10**4,3), maxshape=(None,3), dtype='f', chunks=(10**4,3)))
    dset.append(grp.create_dataset('current', (10**4,3), maxshape=(None,3), dtype='f', chunks=(10**4,3)))
    return dset

if __name__ == '__main__':

    # Keep the file open for as long as the dataset handles are in use
    with h5py.File(path, "a") as hf:
        dset = create_h5py(hf)
        for i in range(3):

            if i == 0:
                # First pass: fill the initial 10**4 rows
                dset[0][:] = np.random.random(dset[0].shape)
                dset[1][:] = np.random.random(dset[1].shape)
            else:
                # Later passes: grow axis 0 by 10**4 rows, then fill the new rows
                dset[0].resize(dset[0].shape[0]+10**4, axis=0)
                dset[0][-10**4:] = np.random.random((10**4,3))
                dset[1].resize(dset[1].shape[0]+10**4, axis=0)
                dset[1][-10**4:] = np.random.random((10**4,3))
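The resize-then-write pair in the loop above can also be wrapped in a small helper. This is only a sketch under a few assumptions: append_rows is a hypothetical name that does not appear in the original answers, the dataset starts empty with shape (0, 3) instead of being pre-filled, and the chunk size of 1000 rows is an arbitrary choice for the demo.

```python
import os
import tempfile

import h5py
import numpy as np


def append_rows(dset, rows):
    """Grow dset along axis 0 and write rows into the newly added space."""
    rows = np.asarray(rows)
    n_new = rows.shape[0]
    if n_new == 0:
        return  # nothing to append
    old_len = dset.shape[0]
    dset.resize(old_len + n_new, axis=0)  # requires maxshape=(None, ...) on axis 0
    dset[old_len:] = rows


# Hypothetical scratch file, used only for this demonstration
path = os.path.join(tempfile.mkdtemp(), "append_demo.h5")

with h5py.File(path, "w") as hf:
    # Start empty; every append happens while the file is still open
    dset = hf.create_dataset("left/voltage", shape=(0, 3), maxshape=(None, 3),
                             dtype="f", chunks=(1000, 3))
    for _ in range(3):
        append_rows(dset, np.random.random((1000, 3)))
    final_shape = dset.shape

# Reopen read-only to confirm the appended rows were stored
with h5py.File(path, "r") as hf:
    stored_shape = hf["left/voltage"].shape

print(final_shape, stored_shape)
```

Starting from an empty dataset removes the special case for i == 0, at the cost of one extra resize on the first write.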
