Reading MATLAB data structure into NumPy arrays
Question
I have a set of MAT-files, each of which contains a MATLAB struct. The struct holds a number of arrays. I would like to open each file and transfer all of them into NumPy arrays. So far I have written the following code:
>>> import h5py
>>> fs = h5py.File('statistics_VAD.mat','r')
>>> list(fs.keys())
['#refs#', 'data']
>>>
>>> fs['data'].visititems(lambda n,o:print(n, o))
C <HDF5 dataset "C": shape (100, 1), type "|O">
P <HDF5 dataset "P": shape (100, 1), type "|O">
V <HDF5 dataset "V": shape (100, 1), type "|O">
Wn <HDF5 dataset "Wn": shape (100, 1), type "|O">
X <HDF5 dataset "X": shape (100, 1), type "|O">
a <HDF5 dataset "a": shape (100, 1), type "|O">
dn <HDF5 dataset "dn": shape (100, 1), type "|O">
>>> struArray = fs['data']
>>> print(struArray['P'])
<HDF5 dataset "P": shape (100, 1), type "|O">
I don't know how to transfer the HDF5 dataset "P" to a NumPy array. Any suggestions would be appreciated.
Recommended answer
The code below is the example mentioned in my comment (dated 2021-03-01). It creates two datasets from NumPy arrays, then a dataset holding two object references, one to each of them. It then shows how to use the object references to access the data. For completeness, it does the same with a second dataset that holds region references.
Notice how h5f[] is used twice: the inner h5f['O_refs'][0] reads the stored object reference, and the outer h5f[...] dereferences it to the dataset it points to, which can then be sliced to get the data. This subtlety often trips up users who are new to references.
import numpy as np
import h5py

with h5py.File('SO_66410592.h5', 'w') as h5f:
    # Create 2 datasets using numpy arrays
    arr = np.arange(100).reshape(20, 5)
    h5f.create_dataset('array1', data=arr)
    arr = np.arange(100, 0, -1).reshape(20, 5)
    h5f.create_dataset('array2', data=arr)

    # Create a dataset of OBJECT references:
    h5f.create_dataset('O_refs', (10,), dtype=h5py.ref_dtype)
    h5f['O_refs'][0] = h5f['array1'].ref
    print(h5f['O_refs'][0])                  # the stored reference
    print(h5f[h5f['O_refs'][0]])             # the dataset it points to
    print(h5f[h5f['O_refs'][0]][0, :])       # first row of that dataset
    h5f['O_refs'][1] = h5f['array2'].ref
    print(h5f['O_refs'][1])
    print(h5f[h5f['O_refs'][1]])
    print(h5f[h5f['O_refs'][1]][-1, :])      # last row of that dataset

    # Create a dataset of REGION references:
    h5f.create_dataset('R_refs', (10,), dtype=h5py.regionref_dtype)
    h5f['R_refs'][0] = h5f['array1'].regionref[0, :]
    print(h5f['R_refs'][0])                  # the stored region reference
    print(h5f[h5f['R_refs'][0]])             # the dataset it points to
    print(h5f[h5f['R_refs'][0]][h5f['R_refs'][0]])   # the referenced region
    h5f['R_refs'][1] = h5f['array2'].regionref[-1, :]
    print(h5f['R_refs'][1])
    print(h5f[h5f['R_refs'][1]])
    print(h5f[h5f['R_refs'][1]][h5f['R_refs'][1]])
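The same double lookup can be applied to a layout like the one in the question, where data/P is an (N, 1) dataset of object references. What follows is only a sketch under assumptions: the helper name read_ref_field, the file name demo_struct.h5 and the small stand-in file it builds are all made up for illustration, and the real MAT-file may differ (for example, it may contain null references or nested structs that need extra handling).

```python
import os
import tempfile

import h5py
import numpy as np


def read_ref_field(h5f, path):
    """Return a list of NumPy arrays, one per object reference stored at `path`.

    Hypothetical helper; assumes every element of the dataset is a valid
    (non-null) object reference to a plain numeric dataset.
    """
    ref_ds = h5f[path]
    arrays = []
    for ref in ref_ds[()].flat:       # ref_ds[()] loads the whole array of references
        arrays.append(h5f[ref][()])   # h5f[ref] dereferences; [()] reads it as ndarray
    return arrays


# Build a small stand-in file that mimics the question's layout (assumed):
# a 'data/P' dataset of shape (4, 1) whose elements point into '#refs#'.
tmp_path = os.path.join(tempfile.mkdtemp(), 'demo_struct.h5')
with h5py.File(tmp_path, 'w') as h5f:
    refs_grp = h5f.create_group('#refs#')
    targets = [
        refs_grp.create_dataset(f'item{i}', data=np.full((2, 3), float(i)))
        for i in range(4)
    ]
    p = h5f.create_dataset('data/P', (4, 1), dtype=h5py.ref_dtype)
    for i, ds in enumerate(targets):
        p[i, 0] = ds.ref

# Read every referenced array back into NumPy.
with h5py.File(tmp_path, 'r') as h5f:
    p_arrays = read_ref_field(h5f, 'data/P')

print(len(p_arrays), p_arrays[0].shape)
```

With the question's file, the equivalent call would presumably be read_ref_field(fs, 'data/P'). Arrays written by MATLAB are usually stored with their dimensions reversed relative to how they appear in MATLAB, so it is worth checking whether a transpose is needed.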