Listing datasets in a group in HDF5

Views: 213 | Published: 2020/11/22 19:14:57 | Tags: python, hdf5

Problem description

I decided to store my data in HDF5 using its hierarchical structure instead of relying on the filesystem. Unfortunately, I'm having performance issues.

My data is laid out as follows: I have about 70 top-level groups, one per date, and each of them contains roughly 8000 datasets. I would like to see the number of datasets per day:

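# hdf5 is an open h5py.File; each top-level key is a date group
for date in hdf5.keys():
    # len() of a group is the number of objects (here, datasets) it contains
    print(len(hdf5[date]))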

I'm finding it a little frustrating that this takes 2+ seconds per iteration.

Also, I have two different HDF5 files with the above layout, and the bigger one is much slower at this.

What am I doing wrong?

Recommended answer

Try creating the file with libver='latest':

import h5py
f = h5py.File('name.hdf5', 'w', libver='latest')  # 'w' creates the file, replacing any existing one

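This should be much faster if you have a lot of datasets per group or a lot of attributes per dataset. The setting only applies to objects created under it, so an existing file has to be rewritten to benefit, and the resulting file may not be readable by older versions of the HDF5 library.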
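Since the files in the question already exist, one way to try the suggestion is to copy their contents into a new file created with libver='latest' and then repeat the per-day count. The following is only a minimal sketch under stated assumptions: the helper names (rewrite_with_latest, count_datasets_per_group, demo), the file names, and the tiny synthetic layout are hypothetical and not from the original post, and the sketch does not measure or guarantee any particular speedup.

```python
import os
import tempfile

import h5py
import numpy as np


def rewrite_with_latest(src_path, dst_path):
    """Copy every top-level object of src_path into a new file
    created with libver='latest' (hypothetical helper)."""
    with h5py.File(src_path, "r") as src, \
            h5py.File(dst_path, "w", libver="latest") as dst:
        for name in src:
            # Group.copy copies a group recursively, including its datasets.
            src.copy(src[name], dst, name=name)


def count_datasets_per_group(path):
    """Return {group name: number of members} for the top-level groups."""
    counts = {}
    with h5py.File(path, "r") as f:
        for name in f:
            obj = f[name]
            if isinstance(obj, h5py.Group):
                # len() of a group is the number of objects directly inside it.
                counts[name] = len(obj)
    return counts


def demo():
    """Build a tiny file shaped like the one in the question, rewrite it,
    and count the datasets per date group in the rewritten copy."""
    with tempfile.TemporaryDirectory() as tmp:
        old_path = os.path.join(tmp, "old.hdf5")   # placeholder file names
        new_path = os.path.join(tmp, "new.hdf5")

        # Synthetic stand-in for the real data: 3 date groups, 5 datasets each.
        with h5py.File(old_path, "w") as f:
            for date in ("2020-01-01", "2020-01-02", "2020-01-03"):
                group = f.create_group(date)
                for i in range(5):
                    group.create_dataset(f"series_{i}", data=np.arange(10))

        rewrite_with_latest(old_path, new_path)
        return count_datasets_per_group(new_path)


if __name__ == "__main__":
    for date, n in demo().items():
        print(date, n)
```

On real data, timing the counting loop on both the original and the rewritten file would show whether the change helps for that particular layout.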
