Complex MATLAB struct mat file read by Python

Question

I know that mat files come in different versions, which need different loading modules in Python, namely scipy.io and h5py. I also searched many similar questions, such as "scipy.io.loadmat nested structures (i.e. dictionaries)" and "How to preserve Matlab struct when accessing in python?". But both approaches fail on more complex mat files. My anno_bbox.mat file has the following structure:

The first two levels:

In size:

In hoi:

In the hoi bboxhuman:

When I use spio.loadmat('anno_bbox.mat', struct_as_record=False, squeeze_me=True), I only get the first level of information as a dictionary:

>>> anno_bbox.keys()
dict_keys(['__header__', '__version__', '__globals__', 'bbox_test', 
'bbox_train', 'list_action'])
>>> bbox_test = anno_bbox['bbox_test']
>>> bbox_test.keys()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'numpy.ndarray' object has no attribute 'keys'
>>> bbox_test
array([<scipy.io.matlab.mio5_params.mat_struct object at 0x7fa8660ab128>,
   <scipy.io.matlab.mio5_params.mat_struct object at 0x7fa8660ab2b0>,
   <scipy.io.matlab.mio5_params.mat_struct object at 0x7fa8660ab710>,
   ...,
   <scipy.io.matlab.mio5_params.mat_struct object at 0x7fa8622ec4a8>,
   <scipy.io.matlab.mio5_params.mat_struct object at 0x7fa8622ecb00>,
   <scipy.io.matlab.mio5_params.mat_struct object at 0x7fa8622f1198>], dtype=object)

I don't know what to do next; this is too complicated for me. The file is available at anno_bbox.mat (8.7MB).

Recommended answer

Working from the shared file (sharing it was a good idea in this case), I loaded it with:

from scipy import io
data = io.loadmat('../Downloads/anno_bbox.mat')

I get:

In [96]: data['bbox_test'].dtype
Out[96]: dtype([('filename', 'O'), ('size', 'O'), ('hoi', 'O')])
In [97]: data['bbox_test'].shape
Out[97]: (1, 9658)

I could have assigned bbox_test = data['bbox_test']. This variable holds 9658 records with three fields, each of object dtype.

So there's a filename (a string embedded in a 1-element array):

In [101]: data['bbox_test'][0,0]['filename']
Out[101]: array(['HICO_test2015_00000001.jpg'], dtype='<U26')

size has 3 fields, each holding a number embedded in an array (a 2d MATLAB matrix):

In [102]: data['bbox_test'][0,0]['size']
Out[102]: 
array([[(array([[640]], dtype=uint16), array([[427]], dtype=uint16), array([[3]], dtype=uint8))]],
      dtype=[('width', 'O'), ('height', 'O'), ('depth', 'O')])
In [112]: data['bbox_test'][0,0]['size'][0,0].item()
Out[112]: 
(array([[640]], dtype=uint16),
 array([[427]], dtype=uint16),
 array([[3]], dtype=uint8))

hoi is more complicated:

In [103]: data['bbox_test'][0,0]['hoi']
Out[103]: 
array([[(array([[246]], dtype=uint8), array([[(array([[320]], dtype=uint16), array([[359]], dtype=uint16), array([[306]], dtype=uint16), array([[349]], dtype=uint16)),...
      dtype=[('id', 'O'), ('bboxhuman', 'O'), ('bboxobject', 'O'), ('connection', 'O'), ('invis', 'O')])


In [126]: data['bbox_test'][0,1]['hoi']['id']
Out[126]: 
array([[array([[132]], dtype=uint8), array([[140]], dtype=uint8),
        array([[144]], dtype=uint8)]], dtype=object)
In [130]: data['bbox_test'][0,1]['hoi']['bboxhuman'][0,0]
Out[130]: 
array([[(array([[226]], dtype=uint8), array([[340]], dtype=uint16), array([[18]], dtype=uint8), array([[210]], dtype=uint8))]],
      dtype=[('x1', 'O'), ('x2', 'O'), ('y1', 'O'), ('y2', 'O')])

So the data that you show in the MATLAB structures is all there, in a nested structure of arrays (often 2d with shape (1,1)) that have either object dtype or multiple fields.
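A minimal sketch of peeling off those (1,1) wrappers. It uses a hand-built stand-in that mimics the printed layout of data['bbox_test'][0,0]['size'] rather than the real file, and the helper name scalar is hypothetical:

```python
import numpy as np

def scalar(a):
    """Return the single value held in a size-1 array, e.g. array([[640]]) -> 640."""
    return np.asarray(a).item()

# Stand-in for data['bbox_test'][0,0]['size'] (assumption: same layout as the output above)
size = np.empty((1, 1), dtype=[('width', 'O'), ('height', 'O'), ('depth', 'O')])
size[0, 0] = (
    np.array([[640]], dtype=np.uint16),
    np.array([[427]], dtype=np.uint16),
    np.array([[3]], dtype=np.uint8),
)

# Index the field by name, then [0, 0] for the single record, then unwrap the inner array
dims = {name: scalar(size[name][0, 0]) for name in size.dtype.names}
print(dims)
```

The same pattern (field name, then [0, 0], then .item()) should apply at each nesting level of the default load.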

Going back and loading with squeeze_me=True, I get a simpler result:

In [133]: data['bbox_test'][1]['hoi']['bboxhuman']
Out[133]: 
array([array((226, 340, 18, 210),
      dtype=[('x1', 'O'), ('x2', 'O'), ('y1', 'O'), ('y2', 'O')]),
       array((230, 356, 19, 212),
      dtype=[('x1', 'O'), ('x2', 'O'), ('y1', 'O'), ('y2', 'O')]),
       array((234, 342, 13, 202),
      dtype=[('x1', 'O'), ('x2', 'O'), ('y1', 'O'), ('y2', 'O')])],
      dtype=object)
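If the goal is a plain numeric array of boxes, the squeezed records can be stacked. The sketch below assumes each box is a 0-d structured array with fields x1, x2, y1, y2 as printed above; the records are rebuilt by hand here, and boxes_to_array is a hypothetical helper name:

```python
import numpy as np

FIELDS = ('x1', 'x2', 'y1', 'y2')
BOX_DTYPE = [(name, 'O') for name in FIELDS]

def boxes_to_array(boxes):
    """Stack an iterable of x1/x2/y1/y2 records into an (n, 4) integer array."""
    rows = [[int(np.asarray(box[name]).item()) for name in FIELDS] for box in boxes]
    return np.array(rows, dtype=np.int64)

# Stand-ins for the three bboxhuman records shown above
boxes = [
    np.array((226, 340, 18, 210), dtype=BOX_DTYPE),
    np.array((230, 356, 19, 212), dtype=BOX_DTYPE),
    np.array((234, 342, 13, 202), dtype=BOX_DTYPE),
]

box_array = boxes_to_array(boxes)
print(box_array.shape)
```

Note that with squeeze_me=True a record with only one hoi entry may come back as a single item rather than an array, so real data may need a length check before iterating.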

With struct_as_record=False as well, I get:

In [136]: data['bbox_test'][1]
Out[136]: <scipy.io.matlab.mio5_params.mat_struct at 0x7f90841e9748>

Looking at the attributes of this rec, I see that I can access the 'fields' by attribute name:

In [137]: rec = data['bbox_test'][1]
In [138]: rec.filename
Out[138]: 'HICO_test2015_00000002.jpg'
In [139]: rec.size
Out[139]: <scipy.io.matlab.mio5_params.mat_struct at 0x7f90841e9b38>

In [141]: rec.size.width
Out[141]: 640
In [142]: rec.hoi
Out[142]: 
array([<scipy.io.matlab.mio5_params.mat_struct object at 0x7f90841e9be0>,
       <scipy.io.matlab.mio5_params.mat_struct object at 0x7f90841e9e10>,
       <scipy.io.matlab.mio5_params.mat_struct object at 0x7f90841ee0b8>],
      dtype=object)

In [145]: rec.hoi[1].bboxhuman
Out[145]: <scipy.io.matlab.mio5_params.mat_struct at 0x7f90841e9f98>
In [146]: rec.hoi[1].bboxhuman.x1
Out[146]: 230

In [147]: vars(rec.hoi[1].bboxhuman)
Out[147]: 
{'_fieldnames': ['x1', 'x2', 'y1', 'y2'],
 'x1': 230,
 'x2': 356,
 'y1': 19,
 'y2': 212}

And so on.
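The attribute access above can be turned into plain dicts and lists with a small recursive walk. This is only a sketch: to_python is a hypothetical helper, it relies on the _fieldnames attribute shown in the vars() output, and the records below are SimpleNamespace stand-ins rather than objects loaded from the file.

```python
import numpy as np
from types import SimpleNamespace

def to_python(obj):
    """Recursively convert struct-like objects and arrays to dicts, lists and scalars."""
    # Struct objects list their MATLAB field names in `_fieldnames`
    if hasattr(obj, "_fieldnames"):
        return {name: to_python(getattr(obj, name)) for name in obj._fieldnames}
    if isinstance(obj, np.ndarray):
        if obj.dtype != object:
            return obj.tolist()           # plain numeric or string array
        if obj.ndim == 0:
            return to_python(obj.item())  # 0-d object array wrapping one value
        return [to_python(item) for item in obj]
    if isinstance(obj, np.generic):
        return obj.item()                 # NumPy scalar -> Python scalar
    return obj

# Stand-ins mimicking rec and rec.hoi[1] from the session above (not loaded from the file)
bbox = SimpleNamespace(_fieldnames=['x1', 'x2', 'y1', 'y2'],
                       x1=np.uint16(230), x2=np.uint16(356),
                       y1=np.uint8(19), y2=np.uint8(212))
hoi = np.empty(1, dtype=object)
hoi[0] = SimpleNamespace(_fieldnames=['id', 'bboxhuman'], id=np.uint8(140), bboxhuman=bbox)
rec = SimpleNamespace(_fieldnames=['filename', 'hoi'],
                      filename='HICO_test2015_00000002.jpg', hoi=hoi)

result = to_python(rec)
print(result)
```

On real data, the call would presumably be to_python(data['bbox_test']) after loading with struct_as_record=False and squeeze_me=True.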
