Pytorch dataloader, too many threads, too much cpu memory allocation


Problem description

I'm training a model using PyTorch. To load the data, I'm using torch.utils.data.DataLoader. The data loader uses a custom database I've implemented. A strange problem occurs: every time the second for in the following code executes, the number of threads/processes increases and a huge amount of memory is allocated:

    for epoch in range(start_epoch, opt.niter + opt.niter_decay + 1):
        epoch_start_time = time.time()
        if epoch != start_epoch:
            epoch_iter = epoch_iter % dataset_size
        for i, item in tqdm(enumerate(dataset, start=epoch_iter)):


I suspect the threads and memory of the previous iterators are not released after each __iter__() call to the data loader. The memory allocated to the workers is close to the amount of memory allocated by the main thread/process at the time the workers are created. That is, in the initial epoch the main thread is using 2GB of memory, so 2 threads of size 2GB are created. In the next epochs, 5GB of memory is allocated by the main thread and two 5GB threads are constructed (num_workers is 2). I suspect that the fork() function copies most of the context to the new threads.

The following is the Activity Monitor showing the processes created by Python; the ZMQbg/1 entries are processes related to Python.

My dataset used by the data loader has 100 sub-datasets; the __getitem__ call randomly selects one (ignoring the index). (The sub-datasets are AlignedDataset from the pix2pixHD GitHub repository.)
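For reference, a dataset with that behaviour can be sketched in a few lines; this is a hypothetical stand-in for the AlignedDataset composition described above (a DataLoader only needs __len__ and __getitem__, so no torch import is required for the sketch):

```python
import random

class RandomSubDataset:
    """Minimal sketch: ignores the requested index and returns a random
    item from a randomly chosen sub-dataset (stand-in for AlignedDataset)."""

    def __init__(self, sub_datasets):
        self.subs = sub_datasets

    def __len__(self):
        # length of the largest sub-dataset; an arbitrary choice here
        return max(len(s) for s in self.subs)

    def __getitem__(self, index):
        sub = random.choice(self.subs)            # index is ignored
        return sub[random.randrange(len(sub))]

ds = RandomSubDataset([[1, 2, 3], ["a", "b"]])
item = ds[5]  # any index returns some item from some sub-dataset
```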

Recommended answer

torch.utils.data.DataLoader prefetches 2 * num_workers batches, so that you always have data ready to send to the GPU/CPU; this could be the reason you see the memory increase.

https://pytorch.org/docs/ stable / _modules / torch / utils / data / dataloader.html
