Per-cell output for threaded IPython Notebooks

Question

I don't want to raise this as an issue, because it seems like a completely unreasonable feature request for what is a fairly amazing tool. But if any readers happen to be familiar with the architecture, I'd be interested to know whether a potential extension seems feasible.

I recently wrote a notebook with some simple threaded code in it, just to see what would happen when I ran it. The notebook code (tl;dr: it starts a number of parallel threads that print in a sleep loop) is available at https://gist.github.com/4562840.
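For readers who cannot open the gist, here is a minimal sketch of the kind of code described: several threads that each print in a sleep loop. It is not the gist's actual contents; the names worker and start_workers and the default values are assumptions made for illustration.

```python
import threading
import time


def worker(name, count, delay):
    """Print a few numbered lines, sleeping between them."""
    for i in range(count):
        time.sleep(delay)
        print("%s: %d" % (name, i))


def start_workers(n_threads=3, count=3, delay=0.05):
    """Start n_threads printing threads and return them without waiting."""
    threads = []
    for n in range(n_threads):
        t = threading.Thread(target=worker, args=("thread-%d" % n, count, delay))
        t.start()
        threads.append(t)
    return threads


threads = start_workers()
for t in threads:
    t.join()  # in a notebook you would skip the join and keep running cells
```

To see the behavior discussed here in a notebook, raise delay to a second or two, drop the join loop, and run other cells while the threads are still printing.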

By hitting SHIFT-RETURN a few times as the code runs, you can observe that any output from the kernel appears in the output area of the current cell, not that of the cell in which the code was run.

I was wondering whether it would be possible, if threads were active for a cell, to display a "refresh" button allowing the output area to be updated asynchronously. Ideally the refresh button would disappear if it was clicked after all threads had ended (after a final update).

This would depend, though, on being able to identify and intercept the print output for each thread and direct it to a buffer for the specific cell's output. So, two questions.

  1. Am I correct in believing that the hard-wiring of Python 2's print statement means that this enhancement cannot be implemented with a standard interpreter?

  2. Are the prospects for Python 3 any better, given that it's possible to sneak another layer into the print() stack inside the IPython kernel?

and especially for those who didn't follow a Python link to get here,

  3. [nobody expects the Spanish Inquisition] More generally, can you point to (language-agnostic) examples of multiple streams being delivered into a page? Are there any established best practices for constructing and modifying the DOM to handle this?

Recommended answer

Update:

Am I correct in believing that the hard-wiring of Python 2's print statement means that this enhancement cannot be implemented with a standard interpreter?

No, the important parts of the print statement are not hardwired at all. print simply writes to sys.stdout, which can be any object with write and flush methods. IPython already completely replaces this object in order to get stdout to the notebook in the first place (see below).
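As a minimal sketch of that point, the following swaps sys.stdout for a plain object that only has write and flush. The class name RecordingStream is made up for illustration; this is not what IPython itself installs.

```python
import sys


class RecordingStream(object):
    """A minimal stdout stand-in: anything with write() and flush() will do."""

    def __init__(self):
        self.chunks = []

    def write(self, text):
        self.chunks.append(text)

    def flush(self):
        pass  # nothing is buffered elsewhere, so there is nothing to push out


recorder = RecordingStream()
original_stdout = sys.stdout
sys.stdout = recorder
try:
    print("hello from print")
finally:
    sys.stdout = original_stdout  # always restore the real stream

captured = "".join(recorder.chunks)
```

Nothing about print changed here: it looked up sys.stdout at call time and wrote to whatever object it found.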

Are the prospects for Python 3 any better, given that it's possible to sneak another layer into the print() stack inside the IPython kernel?

Nope. Overriding sys.stdout is all you need, not print itself (see above, below, and elsewhere). Python 3 offers no advantage here.

[nobody expects the Spanish Inquisition] More generally, can you point to (language-agnostic) examples of multiple streams being delivered into a page?

Sure: the IPython notebook itself. It uses message IDs and metadata to determine the origin of stdout messages and, in turn, where those messages should end up. Below, in my original answer to a question that apparently nobody asked, I show an example of simultaneously drawing output coming from multiple cells whose threads are running concurrently.

In order to get the refresh behavior you desire, you would probably need to do two things:

  1. Replace sys.stdout with your own object that uses the IPython display protocol to send messages with your own thread-identifying metadata (e.g. threading.current_thread().ident). This should be done in a context manager (as below), so it only affects the print statements you actually want it to.
  2. Write an IPython js plugin for handling your new format of stdout messages, so that they are not drawn immediately, but rather stored in arrays, waiting to be drawn.
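The first step can be sketched with the standard library alone. This is only an illustration under stated assumptions: ThreadTaggingStream, tagged_stdout and drain are hypothetical names, and the stream keeps output in memory rather than sending it through the IPython display protocol, so the messaging and the js plugin are left out entirely.

```python
import sys
import threading
from collections import defaultdict
from contextlib import contextmanager


class ThreadTaggingStream(object):
    """Hypothetical stdout replacement that files each write under the calling thread's ident."""

    def __init__(self):
        self._lock = threading.Lock()
        self.pending = defaultdict(list)  # thread ident -> list of text chunks

    def write(self, text):
        ident = threading.current_thread().ident
        with self._lock:
            self.pending[ident].append(text)

    def flush(self):
        pass  # a real version would send the pending chunks as messages here

    def drain(self, ident):
        """Return and clear everything a thread has written so far (the 'refresh' step)."""
        with self._lock:
            return "".join(self.pending.pop(ident, []))


@contextmanager
def tagged_stdout(stream):
    """Swap sys.stdout for stream inside the with-block only."""
    saved = sys.stdout
    sys.stdout = stream
    try:
        yield stream
    finally:
        sys.stdout = saved


def worker():
    print("from worker")


stream = ThreadTaggingStream()
with tagged_stdout(stream):
    t = threading.Thread(target=worker)
    t.start()
    t.join()
    print("from main")

worker_output = stream.drain(t.ident)
main_output = stream.drain(threading.current_thread().ident)
```

Note that sys.stdout is process-wide, so every thread that prints while the with-block is active goes through the tagging stream; the thread ident is what keeps their output apart.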

Original answer (to the wrong, but related, question):

It relies on some shenanigans and private APIs, but this is totally possible with current IPython (though it may not stay that way forever).

Here is an example notebook: http://nbviewer.ipython.org/4563193

In order to do this, you need to understand how IPython gets stdout to the notebook in the first place. This is done by replacing sys.stdout with an OutStream object. This buffers data and then sends it over zeromq when sys.stdout.flush is called, and it ultimately ends up in the browser.

Now, how to send output to a particular cell.

The IPython message protocol uses a 'parent' header to identify which request produced which reply. Every time you ask IPython to run some code, it sets the parent header of various objects (sys.stdout included), so that their side-effect messages are associated with the message that caused them. When you run code in a thread, that means that the current parent_header belongs to the most recent execute_request, rather than to the original one that started any given thread.
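The effect can be reproduced with a toy model. The sketch below is not IPython's OutStream: ToyOutStream is a made-up class, the parent header is simplified to a string (in the real protocol it is a message header), and the cells dict stands in for the browser's output areas.

```python
class ToyOutStream(object):
    """Toy stand-in for the kernel's stdout: one shared parent_header, applied at flush time."""

    def __init__(self, cells):
        self.cells = cells          # maps a request id to that cell's displayed output
        self.parent_header = None
        self._buffer = []

    def write(self, text):
        self._buffer.append(text)

    def flush(self):
        if self._buffer:
            # output is attributed to whichever parent is current *now*
            self.cells[self.parent_header].append("".join(self._buffer))
            self._buffer = []


cells = {"request-1": [], "request-2": []}
out = ToyOutStream(cells)

out.parent_header = "request-1"     # the user runs cell 1, which starts a thread
out.write("cell 1 says hi\n")
out.flush()

out.parent_header = "request-2"     # the user runs cell 2 while the thread is still alive
out.write("late output from cell 1's thread\n")
out.flush()
```

The late write lands under request-2 even though it logically belongs to cell 1, which is the misrouting described in the question.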

With that in mind, here is a context manager that temporarily sets stdout's parent header to a particular value:

import sys
import threading
from contextlib import contextmanager


# one lock shared by every thread that redirects stdout
stdout_lock = threading.Lock()

@contextmanager
def set_stdout_parent(parent):
    """a context manager for setting a particular parent for sys.stdout

    the parent determines the destination cell of output
    """
    # we need a lock, so that other threads don't snatch control
    # while we have set a temporary parent
    with stdout_lock:
        # read the current parent inside the lock, so we never save
        # another thread's temporary value
        save_parent = sys.stdout.parent_header
        sys.stdout.parent_header = parent
        try:
            yield
        finally:
            # the flush is important, because that's when the parent_header actually has its effect
            sys.stdout.flush()
            sys.stdout.parent_header = save_parent

And here is a Thread that records the parent when the thread starts and applies that parent each time it prints, so it behaves as if it were still in the original cell:

import sys
import threading
import time

class counterThread(threading.Thread):
    def run(self):
        # record the parent when the thread starts
        thread_parent = sys.stdout.parent_header
        for i in range(3):
            time.sleep(2)
            # then ensure that the parent is the same as when the thread started
            # every time we print
            with set_stdout_parent(thread_parent):
                # print(i) with a single argument works on both Python 2 and 3
                print(i)

And finally, a notebook tying it all together, with timestamps showing actual concurrent printing to multiple cells:

http://nbviewer.ipython.org/4563193/
