Is it possible to do parallel reads on one h5py file using multiprocessing?


Question


I am trying to speed up the process of reading chunks (load them into RAM memory) out of a h5py dataset file. Right now I try to do this via the multiprocessing library.

import multiprocessing as mp

pool = mp.Pool(NUM_PROCESSES)
gen = pool.imap(loader, indices)

The loader function looks like this:

import h5py

def loader(indices):
    # Each call opens its own read-only handle, reads the selected
    # block into memory, and returns it to the parent process.
    with h5py.File("location", 'r') as dataset:
        x = dataset["name"][indices]
        return x


This actually sometimes works (meaning that the expected loading time is divided by the number of processes and thus parallelized). However, most of the time it doesn't and the loading time just stays as high as it was when loading the data sequentially. Is there anything I can do to fix this? I know h5py supports parallel read/writes through mpi4py but I would just want to know if that is absolutely necessary for only reads as well.

Answer


Parallel reads are fine with h5py, no need for the MPI version. But why do you expect a speed-up here? Your job is almost entirely I/O bound, not CPU bound. Parallel processes are not gonna help because the bottleneck is your hard disk, not the CPU. It wouldn't surprise me if parallelization in this case even slowed down the whole reading operation. Other opinions?
