Printed output not displayed when using joblib in Jupyter notebook


Question

I am using joblib to parallelize some code, and I noticed that printed output does not appear when I use it inside a Jupyter notebook.

I tried the same example in IPython and it worked perfectly.

Here is a minimal (non-)working example to run in a Jupyter notebook cell:

from joblib import Parallel, delayed
Parallel(n_jobs=8)(delayed(print)(i) for i in range(10))

The cell output is [None, None, None, None, None, None, None, None, None, None], but nothing is printed.

What I expect to see (the print order could be random in reality):

0
1
2
3
4
5
6
7
8
9
[None, None, None, None, None, None, None, None, None, None]

Note:

You can see the prints in the logs of the notebook process. But I would like the prints to appear in the notebook, not in the logs of the notebook process.

I have opened a GitHub issue, but it has received minimal attention so far.

Recommended answer

I think this is caused in part by the way Parallel spawns the child workers, and by how Jupyter Notebook handles IO for those workers. When started without a value for backend, Parallel defaults to loky, which uses a pooling strategy that directly uses a fork-exec model to create the subprocesses.

If you start the notebook from a terminal using

$ jupyter-notebook

the regular stderr and stdout streams appear to remain attached to that terminal, while the notebook session starts in a new browser window. Running the posted code snippet in the notebook does produce the expected output, but it goes to stdout and ends up in the terminal (as hinted in the Note in the question). This further supports the suspicion that the behavior is caused by the interaction between loky and notebook, and by the way notebook handles the standard IO streams for child processes.
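The effect can be reproduced without joblib or Jupyter. The sketch below is only an analogy, under the assumption that the notebook kernel captures output by replacing Python's sys.stdout object: it swaps sys.stdout for an in-memory buffer and then starts a child process. The child inherits the OS-level stdout rather than the Python object, so its output bypasses the buffer. The names CHILD_CODE and captured are made up for this illustration.

```python
import contextlib
import io
import subprocess
import sys

# Hypothetical child program used only for this illustration.
CHILD_CODE = "print('from child')"

captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    # Written through the replaced sys.stdout object, so it lands in `captured`.
    print("from parent")
    # The child inherits the OS-level stdout (file descriptor 1), not the
    # Python object above, so its output goes to the terminal instead.
    subprocess.run([sys.executable, "-c", CHILD_CODE], check=True)

# Only the parent's line was captured.
print(repr(captured.getvalue()))
```

When run from a terminal, the child's line shows up there directly, while the buffer holds only the parent's line.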

This led me to this discussion on GitHub (active within the past 2 weeks as of this posting), where the authors of notebook appear to be aware of the problem, but there seems to be no obvious and quick fix at the moment.

If you don't mind switching the backend that Parallel uses to spawn children, you can do so like this:

from joblib import Parallel, delayed
Parallel(n_jobs=8, backend='multiprocessing')(delayed(print)(i) for i in range(10))

With the multiprocessing backend, things work as expected. The threading backend appears to work fine too. This may not be the solution you were hoping for, but hopefully it is sufficient while the notebook authors work on a proper fix.
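For completeness, here is a minimal sketch of the threading variant. The helper show is hypothetical and only there so the result list carries the values instead of None. Since worker threads run inside the calling process, print should go through the same sys.stdout as the notebook cell, which is presumably why this backend works.

```python
from joblib import Parallel, delayed


def show(i):
    # Hypothetical helper: print in the worker and also return the value,
    # so the result list carries something more useful than None.
    print(i)
    return i


# Worker threads live inside the calling process, so print() writes to the
# same sys.stdout object as the code that called Parallel.
results = Parallel(n_jobs=8, backend="threading")(
    delayed(show)(i) for i in range(10)
)
```

The printed order can vary between runs, but results keeps the input order. Keep in mind that threads share the GIL, so this backend mainly suits work that is IO-bound or that releases the GIL.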

I'll cross-post this to GitHub in case anyone there cares to add to this answer (I don't want to misstate anyone's intent or put words in people's mouths!).

Test Environment:
macOS - Mojave (10.14)
Python - 3.7.3
pip3 - 19.3.1

Tested in 2 configurations. Both produced the expected output with multiprocessing and with threading as the backend parameter. Packages were installed using pip3.

Setup 1:

ipykernel                               5.1.1
ipython                                 7.5.0
jupyter                                 1.0.0
jupyter-client                          5.2.4
jupyter-console                         6.0.0
jupyter-core                            4.4.0
notebook                                5.7.8

Setup 2:

ipykernel                               5.1.4
ipython                                 7.12.0
jupyter                                 1.0.0
jupyter-client                          5.3.4
jupyter-console                         6.1.0
jupyter-core                            4.6.2
notebook                                6.0.3

I was also successful using the same versions as 'Setup 2' but with the notebook package downgraded to 6.0.2.

This approach works inconsistently on Windows. Different combinations of software versions yield different results. Doing the most intuitive thing, upgrading everything to the latest version, does not guarantee that it will work.
