Read a MATLAB v7.3 file into a Python list of NumPy arrays via h5py

Question

I know this has been asked before, but in my opinion none of the existing answers explain what is going on, and none of them happen to work for my case. I have a MATLAB v7.3 file that is structured like so:

           ---> rank <1x454 cell>    ---> each element is <53x50 double>
   f.mat
           ---> compare <1x454 cell> ---> each element is <53x50 double>

I hope this is straightforward enough. What I am trying to do is read all 454 arrays, each with dimensions 53x50, from the cell array named 'rank' into a list of NumPy arrays in Python using the h5py library, like so:

import h5py
import numpy as np

with h5py.File("f.mat", "r") as f:
    data = [np.array(element) for element in f['rank']]

What I end up with is a list of arrays of HDF5 object references:

In [53]: data[0]
Out[53]: array([<HDF5 object reference>], dtype=object)

What do I do with these, and how do I get the list of arrays that I need?

Recommended answer

Well, I found the solution to my problem. If anyone else has a better solution or can explain it better, I'd still like to hear it.

Basically, each <HDF5 object reference> has to be used to index the h5py file object in order to get the underlying dataset that it refers to. Once we have the dataset we need, it has to be loaded into memory by indexing it with [:], or with any subset if only part of the array is required. Here is what I mean:

with h5py.File("f.mat", "r") as f:
    # each element is a length-1 array holding one object reference
    data = [f[element[0]][:] for element in f['rank']]

Result:

In [79]: data[0].shape
Out[79]: (50L, 53L)

In [80]: data[0].dtype
Out[80]: dtype('float64')
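
The shape comes back as (50, 53) rather than 53x50 because MATLAB stores arrays in column-major order, so h5py sees the dimensions reversed. Below is a minimal sketch of a helper that wraps the dereferencing step and transposes each array back to MATLAB's orientation. The name `read_cell_array` and its `transpose` flag are illustrative and not part of h5py, and the sketch assumes every cell holds a plain numeric array (no nested cells or structs).

```python
import h5py


def read_cell_array(path, name, transpose=True):
    """Return the cells of a MATLAB v7.3 cell array as a list of NumPy arrays.

    Assumption: every cell contains a plain numeric array.
    """
    arrays = []
    with h5py.File(path, "r") as f:
        # f[name] is a dataset of HDF5 object references, one per cell.
        refs = f[name][()]
        # Flatten so that 1xN and Nx1 cell arrays are handled the same way.
        for ref in refs.ravel():
            # Dereference via the file object, then read the whole dataset.
            arr = f[ref][()]
            # MATLAB is column-major, so the HDF5 dimensions are reversed.
            arrays.append(arr.T if transpose else arr)
    return arrays
```

With the file from the question, `read_cell_array("f.mat", "rank")` would be expected to give 454 arrays of shape (53, 50), and passing `transpose=False` keeps the (50, 53) layout shown above.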

Hope this helps anyone in the future. I think this is the most general solution I've seen so far.
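
On the point about loading only a subset: indexing the dereferenced dataset with a slice reads just that part from disk instead of the whole array. Here is a rough sketch under the same assumptions as the answer (cells hold plain numeric arrays); `read_cell_slices` and its `index` parameter are hypothetical names, not h5py API.

```python
import h5py


def read_cell_slices(path, name, index):
    """Read only `index` (a slice or tuple of slices) from each cell.

    Note: `index` applies to the HDF5 layout, whose dimensions are
    reversed relative to MATLAB's.
    """
    with h5py.File(path, "r") as f:
        # One object reference per cell; f[ref] is the referenced dataset,
        # and indexing it reads only the requested part into memory.
        return [f[ref][index] for ref in f[name][()].ravel()]
```

For example, `read_cell_slices("f.mat", "rank", slice(0, 10))` would read the first 10 rows of each stored dataset, which correspond to the first 10 columns as MATLAB sees them.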
