How to capture streaming output in Python from subprocess.communicate()

Problem description

Currently, I have something like this:

self.process = subprocess.Popen(self.cmd, stdout=subprocess.PIPE)
out, err = self.process.communicate()

The command I'm running streams the output, and I need the process to block before continuing.

How do I make it so that I can capture the streaming output AND have the streaming output printing through stdout? When I set stdout=subprocess.PIPE, I can capture the output, but it won't print the output. If I leave out stdout=subprocess.PIPE, it prints the output, but communicate() will return None.
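A minimal sketch of the two behaviours described above, assuming a throwaway child command built from sys.executable (the variable names are illustrative only). Note that without a pipe, communicate() returns the tuple (None, None) rather than a bare None.

```python
import subprocess
import sys

# Hypothetical child command: a Python one-liner that prints a single line.
child = [sys.executable, "-c", "print('hello')"]

# With a pipe: the output is captured (as bytes) but not shown on the terminal.
piped = subprocess.Popen(child, stdout=subprocess.PIPE)
out, err = piped.communicate()

# Without a pipe: the output goes straight to the terminal and nothing is captured.
inherited = subprocess.Popen(child)
out2, err2 = inherited.communicate()
```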

Is there a solution that would do what I'm asking for WHILE providing blocking until the process is terminated/completed AND avoid buffer issues and pipe deadlock issues mentioned here?

Thanks!

Recommended answer

I can think of a few solutions.

#1: You can just go into the source to grab the code for communicate, copy and paste it, and add code that prints each line as it comes in as well as buffering it up. (If it's possible for your own stdout to block because of, say, a deadlocked parent, you can use a queue.Queue or something instead.) This is obviously a bit hacky, but it's pretty easy, and will be safe.

But really, communicate is complicated because it needs to be fully general, and handle cases you don't. All you need here is the central trick: throw threads at the problem. A dedicated reader thread that doesn't do anything slow or blocking between read calls is all you need.

Something like this:

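import subprocess
import sys
import threading

# universal_newlines=True makes the pipe yield str rather than bytes on Python 3.
self.process = subprocess.Popen(self.cmd, stdout=subprocess.PIPE,
                                universal_newlines=True)
lines = []
def reader():
    # Capture each line and echo it as soon as it arrives.
    for line in self.process.stdout:
        lines.append(line)
        sys.stdout.write(line)
t = threading.Thread(target=reader)
t.start()
self.process.wait()  # block until the command exits
t.join()             # then wait for the reader to drain the pipe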

You may need some error handling in the reader thread. And I'm not 100% sure you can safely use readline here. But this will either work, or be close.
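For the case mentioned under #1 where your own stdout might block, here is a rough, self-contained sketch of the queue idea. The function name run_and_tee and the two-thread layout are assumptions made for illustration, not something from the subprocess module. It assumes Python 3.7+ (for text=True) and a child that writes line-oriented text.

```python
import queue
import subprocess
import sys
import threading


def run_and_tee(cmd):
    """Run cmd, echo its stdout as it arrives, and return (returncode, lines)."""
    # text=True makes process.stdout yield str instead of bytes.
    process = subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
    pending = queue.Queue()  # lines waiting to be echoed
    captured = []

    def reader():
        # Only drains the pipe; nothing slow happens between reads.
        for line in process.stdout:
            captured.append(line)
            pending.put(line)
        pending.put(None)  # sentinel: the pipe reached EOF

    def printer():
        # Echoing happens here, so a blocked stdout cannot stall the reader.
        while True:
            line = pending.get()
            if line is None:
                break
            sys.stdout.write(line)
            sys.stdout.flush()

    threads = [threading.Thread(target=reader), threading.Thread(target=printer)]
    for t in threads:
        t.start()
    returncode = process.wait()  # block until the child exits
    for t in threads:
        t.join()
    return returncode, captured
```

Because the queue is unbounded, the child is never held up by a slow terminal, at the cost of buffering unprinted lines in memory. The final join on the printer thread will still wait if stdout stays blocked, so treat this as a starting point rather than a complete solution.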

#2: Or you can create a wrapper class that takes a file object and tees to stdout/stderr every time anyone reads from it. Then create the pipes manually, and pass in wrapped pipes, instead of using the automagic PIPE. This has the exact same issues as #1 (meaning either no issues, or you need to use a Queue or something if sys.stdout.write can block).

Something like this:

class TeeReader(object):
    def __init__(self, input_file, tee_file):
        self.input_file = input_file
        self.tee_file = tee_file
    def read(self, size=-1):
        ret = self.input_file.read(size)
        if ret:
            self.tee_file.write(ret)
        return ret

In other words, it wraps a file object (or something that acts like one), and acts like a file object. (When you use PIPE, process.stdout is a real file object on Unix, but may just be something that acts like one on Windows.) Any other methods you need to delegate to input_file can probably be delegated directly, without any extra wrapping. Either try this, see which methods communicate raises AttributeError looking for, and code those explicitly, or do the usual __getattr__ trick to delegate everything. PS, if you're worried about this "file object" idea meaning disk storage, read "Everything is a file" on Wikipedia.
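The __getattr__ trick can be sketched roughly as follows. DelegatingTeeReader is a hypothetical name for a variant of the class above, and io.StringIO objects stand in for the real pipe and for stdout so the example runs on its own.

```python
import io


class DelegatingTeeReader:
    """Hypothetical TeeReader variant that forwards unknown attributes."""

    def __init__(self, input_file, tee_file):
        self.input_file = input_file
        self.tee_file = tee_file

    def read(self, size=-1):
        ret = self.input_file.read(size)
        if ret:
            self.tee_file.write(ret)
        return ret

    def __getattr__(self, name):
        # Called only when normal lookup fails, so read() above still wins.
        return getattr(self.input_file, name)


# Stand-ins for a pipe and for sys.stdout.
source = io.StringIO("first\nsecond\n")
echo = io.StringIO()

wrapped = DelegatingTeeReader(source, echo)
data = wrapped.read()          # copied to echo as a side effect
still_open = not wrapped.closed  # "closed" is delegated to source
wrapped.close()                # so is close()
```

One thing to watch: anything delegated this way, such as readline, goes straight to the underlying file and bypasses the tee, so methods whose data you want echoed still need an explicit wrapper like read.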

#3: Finally, you can grab one of the "async subprocess" modules on PyPI or included in twisted or other async frameworks and use that. (This makes it possible to avoid the deadlock problems, but it's not guaranteed; you still have to make sure to service the pipes properly.)
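As one possible illustration of #3, here is a sketch using the standard library's asyncio, which the answer does not name specifically; it is swapped in here as an example of an async framework. The function name run_and_tee_async is made up, and the code assumes Python 3.7+ and a child whose output decodes as UTF-8.

```python
import asyncio
import sys


async def run_and_tee_async(*cmd):
    """Run cmd, echo stdout lines as they arrive, return (returncode, lines)."""
    process = await asyncio.create_subprocess_exec(
        *cmd, stdout=asyncio.subprocess.PIPE
    )
    captured = []
    # Reading until EOF is what servicing the pipe means here.
    while True:
        raw = await process.stdout.readline()
        if not raw:  # empty bytes: the child closed its stdout
            break
        line = raw.decode()  # assumption: UTF-8 output
        captured.append(line)
        sys.stdout.write(line)
        sys.stdout.flush()
    returncode = await process.wait()
    return returncode, captured
```

It could be driven with something like asyncio.run(run_and_tee_async("ping", "-c", "3", "localhost")), where the command is only an example. The stream reader has a default line-length limit, so a child that emits very long lines without newlines may need chunked read calls instead of readline.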
