Printed output not displayed when using joblib in a Jupyter notebook

Problem description

I am using joblib to parallelize some code, and I noticed that nothing gets printed when I use it inside a Jupyter notebook.

I tried the same example in IPython and it worked perfectly.

Here is a minimal (non-)working example to run in a Jupyter notebook cell:

from joblib import Parallel, delayed
Parallel(n_jobs=8)(delayed(print)(i) for i in range(10))

The output I get is [None, None, None, None, None, None, None, None, None, None], but nothing is printed.

What I expect to see (the print order could be random in reality):

0
1
2
3
4
5
6
7
8
9
[None, None, None, None, None, None, None, None, None, None]

Note:

You can see the prints in the logs of the notebook process. But I would like the prints to appear in the notebook itself, not in the logs of the notebook process.

I have opened a GitHub issue, but it has received minimal attention so far.

Recommended answer

I think this is caused in part by the way Parallel spawns the child workers, and by how Jupyter Notebook handles IO for those workers. When started without a value for backend, Parallel defaults to loky, which uses a pooling strategy that creates the subprocesses directly with a fork-exec model.

If you start the notebook from a terminal using

$ jupyter-notebook

the regular stderr and stdout streams appear to remain attached to that terminal, while the notebook session starts in a new browser window. Running the posted code snippet in the notebook does produce the expected output, but it goes to stdout and ends up in the terminal (as hinted in the Note in the question). This further supports the suspicion that the behavior is caused by the interaction between loky and notebook, and by the way notebook handles the standard IO streams for child processes.
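To make that suspicion concrete, here is a small analogy that uses only the standard library and does not involve joblib or Jupyter at all. It assumes (my assumption, not something confirmed above) that the notebook kernel shows output by swapping sys.stdout for a Python-level object. A freshly executed child interpreter knows nothing about that object and writes to the inherited OS-level stdout instead. The function name capture_parent_and_child is made up for this sketch.

```python
import io
import subprocess
import sys
from contextlib import redirect_stdout


def capture_parent_and_child():
    """Return the text captured by a Python-level replacement of sys.stdout."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        # Goes through sys.stdout, so the buffer captures it.
        print("from parent")
        # The child is a new interpreter (fork-exec style). It writes to the
        # inherited OS-level stdout, not to the parent's replaced sys.stdout.
        subprocess.run([sys.executable, "-c", "print('from child')"], check=True)
    return buffer.getvalue()


captured = capture_parent_and_child()
print(repr(captured))
```

Only the parent's line ends up in the buffer. The child's line goes wherever the process's real stdout points, which for a notebook started from a terminal would be that terminal.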

This led me to this discussion on GitHub (active within the past 2 weeks as of this posting), where the authors of notebook appear to be aware of this, but there seems to be no obvious and quick fix for the issue at the moment.

If you don't mind switching the backend that Parallel uses to spawn children, you can do so like this:

from joblib import Parallel, delayed
Parallel(n_jobs=8, backend='multiprocessing')(delayed(print)(i) for i in range(10))

With the multiprocessing backend, things work as expected. threading appears to work fine too. This may not be the solution you were hoping for, but hopefully it is sufficient while the notebook authors work on a proper fix.
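If you would rather stay on the default backend, another possible workaround (not part of the original answer, and only a sketch) is to capture what each task prints inside the worker and hand the text back to the notebook process, which then prints it itself. The helper name run_captured is hypothetical. The sketch assumes the output can wait until the tasks have finished.

```python
import io
from contextlib import redirect_stdout

from joblib import Parallel, delayed


def run_captured(func, *args, **kwargs):
    """Call func and return (result, text printed during the call)."""
    buffer = io.StringIO()
    with redirect_stdout(buffer):
        result = func(*args, **kwargs)
    return result, buffer.getvalue()


if __name__ == "__main__":
    # Each worker returns its captured text along with the task result.
    outputs = Parallel(n_jobs=2)(
        delayed(run_captured)(print, i) for i in range(4)
    )
    # Printing happens here, in the parent process.
    for _result, text in outputs:
        print(text, end="")
```

The text shows up only after the tasks complete, and in task order rather than execution order. Because redirect_stdout swaps the process-wide sys.stdout, this sketch is meant for process-based backends; with threading, concurrent tasks would interfere with each other's capture.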

I'll cross-post this to GitHub in case anyone there cares to add to this answer (I don't want to misstate anyone's intent or put words in people's mouths!).

Test Environment:
macOS - Mojave (10.14)
Python - 3.7.3
pip3 - 19.3.1

Tested in 2 configurations. Both produced the expected output with multiprocessing and with threading as the backend parameter. Packages were installed using pip3.

Setup 1:

ipykernel                               5.1.1
ipython                                 7.5.0
jupyter                                 1.0.0
jupyter-client                          5.2.4
jupyter-console                         6.0.0
jupyter-core                            4.4.0
notebook                                5.7.8

Setup 2:

ipykernel                               5.1.4
ipython                                 7.12.0
jupyter                                 1.0.0
jupyter-client                          5.3.4
jupyter-console                         6.1.0
jupyter-core                            4.6.2
notebook                                6.0.3

I was also successful using the same versions as 'Setup 2' but with the notebook package downgraded to 6.0.2.

This approach works inconsistently on Windows. Different combinations of software versions yield different results. Doing the most intuitive thing, upgrading everything to the latest version, does not guarantee it will work.
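Since the outcome seems to depend on the exact combination of versions, it may help to record them when comparing machines. The following is a minimal sketch using importlib.metadata from the standard library, so it assumes Python 3.8 or newer (newer than the 3.7.3 used in the tests above). The function name installed_versions and the package list are my own choices for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Distributions relevant to this issue; adjust as needed.
PACKAGES = [
    "joblib",
    "ipykernel",
    "ipython",
    "jupyter-client",
    "jupyter-core",
    "notebook",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


for name, ver in installed_versions(PACKAGES).items():
    print(f"{name:<20} {ver or 'not installed'}")
```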
