HDF5 file to pandas DataFrame
Question
I downloaded a dataset that is stored in .h5 files. I need to keep only certain columns and be able to manipulate the data in them.
To do this, I tried to load it into a pandas DataFrame. I tried:
pd.read_hdf(path)
But I get: No dataset in HDF5 file.
I found answers on SO (read HDF5 file to pandas DataFrame with conditions), but I don't need conditions. The answer also assumes things about how the file was written, and since I am not the creator of the file I can't do anything about that.
I also tried using h5py:
df = h5py.File(path)
But the result is not easy to manipulate, and I can't seem to get the columns out of it (only the names of the columns, using df.keys()).
Any idea how to do this?
Recommended answer
pandas' HDF support requires the HDF file to be formatted in a very specific way, namely the layout that pandas itself writes through HDFStore or DataFrame.to_hdf. A file produced by another tool will generally not be readable with pd.read_hdf. See https://stackoverflow.com/a/33644128/4128030 for more info.
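For a file that pandas did not write, one possible workaround is to read the datasets with h5py and build the DataFrame by hand. The sketch below is only an illustration under stated assumptions: it assumes every column is stored as a top-level, one-dimensional dataset and that all of them have the same length. The helper name h5_to_dataframe, the demo file, and the column names a, b, c are hypothetical; the real file in the question may be laid out differently (nested groups, compound dtypes), in which case the loop would need adapting.

```python
import os
import tempfile

import h5py
import numpy as np
import pandas as pd


def h5_to_dataframe(path, columns=None):
    """Load top-level 1-D datasets into a DataFrame (hypothetical helper).

    Assumes one dataset per column, all of equal length.
    Groups and multi-dimensional datasets are skipped.
    """
    data = {}
    with h5py.File(path, "r") as f:
        names = columns if columns is not None else list(f.keys())
        for name in names:
            obj = f[name]
            if isinstance(obj, h5py.Dataset) and obj.ndim == 1:
                # obj[()] reads the whole dataset into a NumPy array
                data[name] = obj[()]
    return pd.DataFrame(data)


# Build a small made-up file so the sketch runs end to end.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.h5")
with h5py.File(demo_path, "w") as f:
    f.create_dataset("a", data=np.arange(3))
    f.create_dataset("b", data=np.array([0.5, 1.5, 2.5]))
    f.create_dataset("c", data=np.array([10, 20, 30]))

# Keep only the columns of interest.
df = h5_to_dataframe(demo_path, columns=["a", "b"])
print(df)
```

Passing columns keeps only the listed datasets, which matches the "keep only certain columns" requirement; leaving it as None loads every qualifying top-level dataset. Opening the file with an explicit "r" mode inside a with block also ensures it is closed once the data has been copied into memory.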