Using subprocess.Popen for a Process with Large Output

Question

I have some Python code that executes an external app. It works fine when the app produces a small amount of output, but hangs when there is a lot. My code looks like:

p = subprocess.Popen(cmd, shell=True, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
errcode = p.wait()
retval = p.stdout.read()
errmess = p.stderr.read()
if errcode:
    log.error('cmd failed <%s>: %s' % (errcode,errmess))

There are comments in the docs that seem to indicate the potential issue. Under wait, there is:

Warning: This will deadlock if the child process generates enough output to a stdout or stderr pipe such that it blocks waiting for the OS pipe buffer to accept more data. Use communicate() to avoid that.

though under communicate, I see:

Note: The data read is buffered in memory, so do not use this method if the data size is large or unlimited.

So it is unclear to me whether I should use either of these when I have a large amount of data. The docs don't say which method I should use in that case.

I do need the return code from the command, and I parse and use both stdout and stderr.

So what is the equivalent way in Python to run an external app that is going to produce a large amount of output?

Recommended answer

You're doing blocking reads on two files; the first needs to complete before the second starts. If the application writes a lot to stderr and nothing to stdout, then your process will sit waiting for data on stdout that isn't coming, while the program you're running sits there waiting for the stuff it wrote to stderr to be read (which it never will be, since you're waiting for stdout). Calling wait() before either read makes this worse: as the warning you quoted says, the child can block on a full pipe before you have read anything at all.

There are a few ways you can fix this.

The simplest is to not intercept stderr; leave stderr=None. Errors will then go directly to your own process's stderr, so you can't capture them and display them as part of your own message. For command-line tools, this is often OK. For other apps, it can be a problem.
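A minimal sketch of this first option, assuming the command is passed as an argument list. The helper name run_capture_stdout is hypothetical, not something from the question. With only one pipe to drain, reading stdout to EOF before calling wait() means the child cannot block on an unread pipe:

```python
import subprocess
import sys


def run_capture_stdout(cmd):
    """Run cmd and capture stdout only; stderr is inherited from the parent.

    run_capture_stdout is a hypothetical helper name used for illustration.
    """
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=None)
    chunks = []
    # Drain the single pipe to EOF *before* wait(), so the child never
    # stalls on a full pipe buffer.
    for line in p.stdout:
        chunks.append(line)
    p.stdout.close()
    errcode = p.wait()
    return errcode, b"".join(chunks)


if __name__ == "__main__":
    code, out = run_capture_stdout([sys.executable, "-c", "print('hello')"])
    print(code, len(out))
```

If the output is too large to hold in memory, each line could be parsed inside the loop instead of being appended to a list.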

Another simple approach is to redirect stderr to stdout, so you only have one incoming file: set stderr=subprocess.STDOUT. This means you can't distinguish regular output from error output. This may or may not be acceptable, depending on how the application writes its output.
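A sketch of the merged-stream option, under the same assumptions as before. The names run_merged and handle_line are hypothetical. Lines are handed to a callback as they arrive, so nothing has to be buffered in full:

```python
import subprocess
import sys


def run_merged(cmd, handle_line):
    """Run cmd with stderr folded into stdout; pass each line to handle_line.

    run_merged and handle_line are hypothetical names used for illustration.
    """
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    # Only one pipe exists, so reading it to EOF cannot deadlock.
    for line in p.stdout:
        handle_line(line)
    p.stdout.close()
    return p.wait()


if __name__ == "__main__":
    rc = run_merged([sys.executable, "-c", "print('hello')"],
                    lambda line: sys.stdout.write(line.decode()))
    print("exit code:", rc)
```

Note that the relative order of lines from the two streams depends on how the child buffers its output, so it may not match what you would see on a terminal.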

The complete and complicated way of handling this is select (http://docs.python.org/library/select.html). This lets you read in a non-blocking way: you get data whenever it appears on either stdout or stderr. I'd only recommend this if it's really necessary. It won't help on Windows, where select works only on sockets, not on pipes.
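A rough sketch of the select approach, assuming a POSIX system. The function name run_with_select and the 64 KiB read size are arbitrary choices for illustration. It reads from whichever pipe has data ready, so neither pipe can fill up while the other is being waited on:

```python
import os
import select
import subprocess
import sys


def run_with_select(cmd):
    """POSIX-only sketch: drain stdout and stderr as data becomes available.

    run_with_select is a hypothetical helper name used for illustration.
    """
    p = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    out_fd, err_fd = p.stdout.fileno(), p.stderr.fileno()
    buffers = {out_fd: [], err_fd: []}
    open_fds = [out_fd, err_fd]
    while open_fds:
        # Block until at least one of the remaining pipes is readable.
        readable, _, _ = select.select(open_fds, [], [])
        for fd in readable:
            data = os.read(fd, 65536)
            if data:
                buffers[fd].append(data)
            else:
                open_fds.remove(fd)  # empty read means EOF on this pipe
    p.stdout.close()
    p.stderr.close()
    errcode = p.wait()
    return errcode, b"".join(buffers[out_fd]), b"".join(buffers[err_fd])


if __name__ == "__main__":
    rc, out, err = run_with_select(
        [sys.executable, "-c", "import sys; print('o'); sys.stderr.write('e')"])
    print(rc, out, err)
```

This version still collects everything in memory; for truly large or unbounded output, each chunk could be parsed or written to a file inside the loop instead of being appended to a list.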
