Confusion about multiprocessing and workers in Keras fit_generator() with Windows 10 in Spyder

Question

The documentation for fit_generator() (docs: https://keras.io/models/sequential/#fit_generator) says that the parameter use_multiprocessing accepts a bool which, if set to True, enables process-based threading.

It also says that the parameter workers is an integer that designates how many processes to spin up when using process-based threading. Apparently it defaults to 1 (a single process-based thread), and if set to 0 it will execute the generator on the main thread.

I took this to mean that with use_multiprocessing=True and workers > 0 (say 6, for example), it would spin up 6 processes running the generator independently. However, when I test this, I think I must be misunderstanding something (see below).

My confusion arises from the following. If I set use_multiprocessing=False and workers=1, Task Manager shows all 12 of my virtual cores being utilized somewhat evenly, at about 50% CPU usage, while training my model (for reference, I have an i7-8750H CPU with 6 cores that support virtualization, and I have virtualization enabled in BIOS). If I increase the number of workers, CPU usage goes to 100% and training is much faster. If I decrease the number of workers to 0 so that the generator runs on the main thread, all of my virtual cores are still being used, but somewhat unevenly, and CPU usage is about 36%.

Unfortunately, if I set use_multiprocessing=True, I get a BrokenPipeError. I have yet to fix this, but I'd like to better understand what I am trying to fix here.
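
For context on the error above: one common cause of multiprocessing failures on Windows (not confirmed as the cause here) is that Windows starts child processes with the "spawn" method, which re-imports the main module in every child. Code that starts processes therefore needs to sit behind an `if __name__ == "__main__":` guard, and it tends to work more reliably when run as a script than when typed into an interactive console. The sketch below uses only the standard library; `load_batch` and `load_all` are hypothetical names standing in for a data generator, not Keras functions.

```python
import multiprocessing as mp


def load_batch(index):
    """Hypothetical stand-in for expensive data loading; returns a fake batch."""
    return [index * 10 + offset for offset in range(3)]


def load_all(num_batches, num_workers):
    """Load batches in separate processes and return them in index order."""
    with mp.Pool(processes=num_workers) as pool:
        return pool.map(load_batch, range(num_batches))


if __name__ == "__main__":
    # With the "spawn" start method, each child re-imports this module.
    # Without this guard, every child would try to create its own pool
    # while it is still being imported.
    mp.freeze_support()  # only has an effect in frozen Windows executables
    print(load_all(num_batches=4, num_workers=2))
```

The worker function is defined at module level so that child processes can find it by name after re-importing the module.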

If someone could please explain the difference between training with use_multiprocessing=True and use_multiprocessing=False, as well as what happens when workers is 0, 1, and > 1, I would be very grateful. If it matters, I am using TensorFlow (GPU version) as the backend for Keras, with Python 3.6 in Spyder with the IPython console.

My suspicion is that use_multiprocessing is actually enabling multiprocessing when True whereas workers>1 when use_multiprocessing=False is setting the number of threads, but that's just a guess.
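
The distinction guessed at here (threads versus processes) can be observed directly with the standard library. This is only a sketch of the general idea, not of what Keras does internally; `describe_worker` and `run_in_threads` are hypothetical helper names. Thread workers all report the same process id as the caller, while process workers report ids of their own.

```python
import os
import threading
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor


def describe_worker(_):
    """Return (process id, thread id) of whoever executes this call."""
    return os.getpid(), threading.get_ident()


def run_in_threads(num_workers, num_tasks):
    """Run describe_worker on a pool of threads inside the current process."""
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        return list(pool.map(describe_worker, range(num_tasks)))


if __name__ == "__main__":
    thread_pids = {pid for pid, _ in run_in_threads(3, 6)}
    # Process workers live in separate interpreters, each with its own pid.
    with ProcessPoolExecutor(max_workers=3) as pool:
        process_pids = {pid for pid, _ in pool.map(describe_worker, range(6))}
    print("main process pid:", os.getpid())
    print("pids seen by thread workers:", thread_pids)
    print("pids seen by process workers:", process_pids)
```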

Recommended answer

The only thing I know is that when use_multiprocessing=False and workers > 1, there are many parallel data-loading threads (I'm not really good with these names: threads, processes, etc.). In any case, there are five parallel fronts loading data into the queue. So loading data is faster, but it doesn't affect the model's speed. This can be good when data loading takes too long.
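
The "parallel fronts loading data to the queue" idea can be sketched with plain Python threads and a bounded queue. This is a simplified illustration under my own assumptions, not Keras's actual implementation: `make_batch` and `prefetch_batches` are hypothetical names, and the `max_queue_size` argument is only meant to echo the fit_generator() parameter of the same name.

```python
import queue
import threading


def make_batch(index):
    """Hypothetical stand-in for loading and preprocessing one batch."""
    return {"index": index, "data": [index] * 4}


def prefetch_batches(num_batches, num_workers, max_queue_size=10):
    """Yield batches that worker threads prepare in the background."""
    pending = queue.Queue()
    for i in range(num_batches):
        pending.put(i)
    ready = queue.Queue(maxsize=max_queue_size)

    def worker():
        while True:
            try:
                i = pending.get_nowait()
            except queue.Empty:
                return  # no batches left to load
            ready.put(make_batch(i))  # blocks while the ready queue is full

    threads = [
        threading.Thread(target=worker, daemon=True) for _ in range(num_workers)
    ]
    for t in threads:
        t.start()

    # The consumer (the training loop, in the Keras case) takes batches
    # as soon as any worker has one ready; arrival order is not guaranteed.
    for _ in range(num_batches):
        yield ready.get()
```

Iterating over `prefetch_batches(8, num_workers=3)` yields all eight batches, possibly out of order. The consumer only waits when the queue is empty, which is why extra workers help when loading is the bottleneck and do nothing for the speed of the model itself.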

Whenever I tried use_multiprocessing=True, everything froze.
