h5py: accessing data in Datasets in SVHN


Question

I want to read the Street View House Numbers (SVHN) dataset using h5py.

In [117]: def printname(name):
     ...:     print(name)
     ...:

In [118]: data['/digitStruct'].visit(printname)
bbox
name

There are two groups in the data, bbox and name. name is the group corresponding to the file name data, and bbox is the group corresponding to the width, height, top, left and label data.

How can I access all the data in the name and bbox groups?

I tried the following code from the docs, but it just returns HDF5 object references.

In [119]: for i in data['/digitStruct/name']:
     ...:     print(i[0])
     ...:
     ...:
<HDF5 object reference>
<HDF5 object reference>
<HDF5 object reference>
<HDF5 object reference>
<HDF5 object reference>
<HDF5 object reference>

Python version: 3.5; OS: Windows 10.

Recommended answer

I'll answer my own question here. After reading the h5py docs, here is my code:

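import h5py


def get_box_data(index, hdf5_data):
    """
    Get `left, top, width, height` and `label` of each digit in one picture.
    :param index: position of the picture in /digitStruct/bbox
    :param hdf5_data: the open h5py.File for digitStruct.mat
    :return: dict mapping each attribute name to a list of ints, one per digit
    """
    meta_data = dict()
    meta_data['height'] = []
    meta_data['label'] = []
    meta_data['left'] = []
    meta_data['top'] = []
    meta_data['width'] = []

    def print_attrs(name, obj):
        # Called by visititems for every attribute dataset of this bbox entry.
        vals = []
        if obj.shape[0] == 1:
            # One digit: the value is stored inline.
            vals.append(int(obj[0][0]))
        else:
            # Several digits: each row is a reference to the actual value.
            for k in range(obj.shape[0]):
                vals.append(int(hdf5_data[obj[k][0]][0][0]))
        meta_data[name] = vals

    box = hdf5_data['/digitStruct/bbox'][index]
    hdf5_data[box[0]].visititems(print_attrs)
    return meta_data


def get_name(index, hdf5_data):
    """Return the file name of the picture at `index`."""
    name = hdf5_data['/digitStruct/name']
    # Dataset.value was removed in h5py 3.0; [()] reads the whole dataset.
    return ''.join([chr(v[0]) for v in hdf5_data[name[index][0]][()]])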

Here hdf5_data is train_data = h5py.File('./train/digitStruct.mat'), and it works fine!
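The key step in both functions is indexing the file with an object reference, which is what turns `<HDF5 object reference>` into real data. The following is a minimal sketch of that step on a tiny in-memory file, not the real SVHN file. The layout it builds (an N x 1 dataset of references, each pointing to a column of character codes) is an assumption modeled on how the code above reads `/digitStruct/name`, and `build_demo_file`, `read_name` and the `/refs/...` paths are hypothetical names used only for illustration.

```python
import h5py
import numpy as np


def build_demo_file():
    """Build a tiny in-memory HDF5 file with an assumed digitStruct-like layout."""
    # driver="core" with backing_store=False keeps the file in memory only.
    f = h5py.File("demo_digitStruct.h5", "w", driver="core", backing_store=False)
    names = ["1.png", "2.png"]
    # An N x 1 dataset whose elements are object references.
    refs = f.create_dataset("/digitStruct/name", (len(names), 1), dtype=h5py.ref_dtype)
    for i, text in enumerate(names):
        # Each name is stored as a column of uint16 character codes.
        codes = np.array([[ord(c)] for c in text], dtype=np.uint16)
        target = f.create_dataset("/refs/name_%d" % i, data=codes)
        refs[i, 0] = target.ref
    return f


def read_name(f, index):
    """Dereference one entry of /digitStruct/name and decode it to a string."""
    ref = f["/digitStruct/name"][index][0]  # an h5py.Reference, not the data
    codes = f[ref][()]                      # index the file with the reference
    return "".join(chr(c[0]) for c in codes)


demo = build_demo_file()
first_ref = demo["/digitStruct/name"][0][0]
first_name = read_name(demo, 0)
second_name = read_name(demo, 1)
demo.close()
```

Printing `first_ref` shows only the reference object, while `first_name` holds the decoded string, which mirrors the difference between the loop in the question and `get_name` above.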

Here is some sample code that uses the two functions above:

import os

import h5py
import tqdm

# `folder` is the directory that contains digitStruct.mat
mat_data = h5py.File(os.path.join(folder, 'digitStruct.mat'), 'r')
size = mat_data['/digitStruct/name'].size

for _i in tqdm.tqdm(range(size)):
    pic = get_name(_i, mat_data)
    box = get_box_data(_i, mat_data)

The functions above show how to get the name and the bbox data of each entry in the data!
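The dict returned by get_box_data holds one list per attribute, so the values for a single digit are spread across five lists. If one record per digit is more convenient, the lists can be regrouped as in the sketch below. The helper name `to_boxes` is hypothetical, and the numbers in `example` are made-up values for illustration, not taken from the dataset.

```python
def to_boxes(meta_data):
    """Regroup per-attribute lists into one dict per digit."""
    keys = ("label", "left", "top", "width", "height")
    columns = [meta_data[k] for k in keys]
    # zip(*columns) walks the lists in parallel, one digit at a time.
    return [dict(zip(keys, row)) for row in zip(*columns)]


# Made-up values in the same shape that get_box_data returns.
example = {
    "height": [30, 32],
    "label": [1, 9],
    "left": [246, 323],
    "top": [77, 81],
    "width": [81, 96],
}
boxes = to_boxes(example)
```

Each element of `boxes` then describes one digit, which may be easier to pass to cropping or plotting code than five parallel lists.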
