Printed output not displayed when using joblib in Jupyter Notebook
Problem description
I am using joblib to parallelize some code, and I noticed that printed output does not appear when I use it inside a Jupyter notebook.
I tried the same example in IPython and it worked perfectly.
Here is a minimal (not) working example to run in a Jupyter notebook cell:
from joblib import Parallel, delayed
Parallel(n_jobs=8)(delayed(print)(i) for i in range(10))
The cell returns [None, None, None, None, None, None, None, None, None, None] but nothing is printed.
What I expect to see (print order could be random in reality):
0
1
2
3
4
5
6
7
8
9
[None, None, None, None, None, None, None, None, None, None]
Note:
You can see the prints in the logs of the notebook process. But I would like the prints to appear in the notebook, not in the logs of the notebook process.
I have opened a GitHub issue, but it has received minimal attention so far.
Recommended answer
I think this is caused in part by the way Parallel spawns the child workers, and by how Jupyter Notebook handles IO for those workers. When started without a value for backend, Parallel defaults to loky, which uses a pooling strategy that creates the subprocesses directly with a fork-exec model.
If you start the notebook from a terminal using
$ jupyter-notebook
the regular stderr and stdout streams appear to remain attached to that terminal, while the notebook session starts in a new browser window. Running the posted code snippet in the notebook does produce the expected output, but it seems to go to stdout and ends up in the terminal (as hinted in the Note in the question). This further supports the suspicion that the behavior is caused by the interaction between loky and notebook, and by the way notebook handles the standard IO streams for child processes.
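The stream behavior described above can be illustrated with the standard library alone. The sketch below is only an analogy, not how ipykernel or loky are actually implemented: it assumes that capturing output by swapping `sys.stdout` inside the parent interpreter is representative of what the notebook does. The function name `run_parent_and_child` is made up for this example.

```python
import io
import subprocess
import sys
from contextlib import redirect_stdout


def run_parent_and_child():
    """Return what a Python-level stdout capture in the parent actually sees."""
    captured = io.StringIO()
    with redirect_stdout(captured):
        # Goes through the parent's (replaced) sys.stdout, so it is captured.
        print("from parent")
        # A freshly exec'd interpreter builds its own sys.stdout on the
        # inherited file descriptor 1, so this bypasses the capture and
        # lands wherever the parent process's real stdout points.
        subprocess.run(
            [sys.executable, "-c", "print('from child')"],
            check=True,
        )
    return captured.getvalue()


if __name__ == "__main__":
    seen = run_parent_and_child()
    print("captured in parent:", repr(seen))
```

Only the parent's line shows up in the captured text; the child's line goes to the process's original stdout, which for a notebook server started from a terminal would be that terminal.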
This led me to this discussion on GitHub (active within the past 2 weeks as of this posting), where the authors of notebook appear to be aware of the problem, but there seems to be no obvious quick fix for it at the moment.
If you don't mind switching the backend that Parallel uses to spawn children, you can do so like this:
from joblib import Parallel, delayed
Parallel(n_jobs=8, backend='multiprocessing')(delayed(print)(i) for i in range(10))
With the multiprocessing backend, things work as expected. The threading backend looks to work fine too. This may not be the solution you were hoping for, but hopefully it is sufficient while the notebook authors work on a proper fix.
I'll cross-post this to GitHub in case anyone there cares to add to this answer (I don't want to misstate anyone's intent or put words in people's mouths!).
Test Environment:
MacOS - Mojave (10.14)
Python - 3.7.3
pip3 - 19.3.1
Tested in 2 configurations. Both multiprocessing and threading were confirmed to produce the expected output when passed as the backend parameter. Packages were installed using pip3.
Setup 1:
ipykernel 5.1.1
ipython 7.5.0
jupyter 1.0.0
jupyter-client 5.2.4
jupyter-console 6.0.0
jupyter-core 4.4.0
notebook 5.7.8
Setup 2:
ipykernel 5.1.4
ipython 7.12.0
jupyter 1.0.0
jupyter-client 5.3.4
jupyter-console 6.1.0
jupyter-core 4.6.2
notebook 6.0.3
I was also successful using the same versions as 'Setup 2' but with the notebook package downgraded to 6.0.2.
This approach works inconsistently on Windows. Different combinations of software versions yield different results. Doing the most intuitive thing, upgrading everything to the latest version, does not guarantee it will work.