Read a MATLAB v7.3 file into a Python list of NumPy arrays via h5py


Question

I know this has been asked before, but in my opinion none of the existing answers explain what is going on, and none of them happen to work for my case. I have a MATLAB v7.3 file that is structured like so:

           ---> rank <1x454 cell>    ---> each element is <53x50 double>
   f.mat
           ---> compare <1x454 cell> ---> each element is <53x50 double>

I hope this is straightforward enough. What I am trying to do is read all 454 arrays of dimensions 53x50 from the cell array named 'rank' into a list of NumPy arrays in Python using the h5py library, like so:

import h5py
import numpy as np

with h5py.File("f.mat", "r") as f:
    # Each element of f['rank'] is a row holding one HDF5 object reference
    data = [np.array(element) for element in f['rank']]

What I end up with is a list of arrays of HDF5 object references:

In [53]: data[0]
Out[53]: array([<HDF5 object reference>], dtype=object)

What do I do with this, and how do I get the list of arrays that I need?

Recommended answer

Well, I found the solution to my problem. If anyone has a better solution or can explain it better, I'd still like to hear it.

Basically, each <HDF5 object reference> has to be used to index the h5py file object to get the underlying dataset that it refers to. Once we have the dataset that is needed, it has to be loaded into memory by indexing it with [:], or with any subset if only part of the array is required. Here is what I mean:

with h5py.File("f.mat", "r") as f:
    # element[0] is the object reference; f[ref] is the dataset, [:] loads it
    data = [f[element[0]][:] for element in f['rank']]

And the result:

In [79]: data[0].shape
Out[79]: (50L, 53L)

In [80]: data[0].dtype
Out[80]: dtype('float64')
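
The same idea can be wrapped in a small helper. What follows is a minimal sketch, not part of the original answer: the function name `read_cell_array` and its `transpose` parameter are hypothetical, and it assumes every cell holds a plain numeric matrix. Note that the shape printed above is (50, 53) rather than 53x50. MATLAB stores arrays in column-major order while h5py reads them row-major, so the dimensions come back reversed; the sketch transposes by default to restore MATLAB's orientation.

```python
import h5py
import numpy as np


def read_cell_array(path, name, transpose=True):
    """Read a MATLAB v7.3 cell array of numeric matrices into a list of arrays.

    `read_cell_array` and `transpose` are illustrative names, not h5py API.
    """
    with h5py.File(path, "r") as f:
        # The cell array is stored as a dataset of HDF5 object references.
        refs = f[name][()]
        arrays = []
        for ref in np.ravel(refs):
            # Index the file with the reference, then load the data into memory.
            arr = f[ref][()]
            # Dimensions arrive reversed relative to MATLAB; undo if requested.
            arrays.append(arr.T if transpose else arr)
    return arrays
```

With the file from the question, `read_cell_array("f.mat", "rank")` would be expected to return 454 arrays of shape (53, 50); pass `transpose=False` to keep the layout exactly as h5py reads it.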

Hope this helps anyone in the future. I think this is the most general solution I've seen so far.
