Python: subprocess.call, stdout to file, stderr to file, display stderr on screen in real time


Problem Description

I have a command line tool (actually, several) that I am writing a wrapper for in Python.

The tool is generally used like this:

 $ path_to_tool -option1 -option2 > file_out

The user gets the output written to file_out, and is also able to see various status messages of the tool as it is running.

I want to replicate this behavior, while also logging stderr (the status messages) to a file.

What I have is this:

from subprocess import call
call(['path_to_tool', '-option1', '-option2'], stdout=file_out, stderr=log_file)

This works fine EXCEPT that stderr is not written to the screen. I can add code to print the contents of the log_file to the screen of course, but then the user will see it after everything is done rather than while it is happening.

To recap, desired behavior is:

  1. Use call() or subprocess()
  2. Direct stdout to a file
  3. Direct stderr to a file, while also writing stderr to the screen in real time, as if the tool had been invoked directly from the command line.

I have a feeling I'm either missing something really simple, or this is much more complicated than I thought...thanks for any help!

this only needs to work on Linux.

Answer

You can do this with subprocess, but it's not trivial. If you look at the Frequently Used Arguments in the docs, you'll see that you can pass PIPE as the stderr argument, which creates a new pipe, passes one side of the pipe to the child process, and makes the other side available to use as the stderr attribute.*

So, you will need to service that pipe, writing to the screen and to the file. In general, getting the details right for this is very tricky.** In your case, there's only one pipe, and you're planning on servicing it synchronously, so it's not that bad.

import subprocess
import sys

# file_out and log_file are assumed to be open file objects, as in the question.
proc = subprocess.Popen(['path_to_tool', '-option1', '-option2'],
                        stdout=file_out, stderr=subprocess.PIPE,
                        universal_newlines=True)  # text mode, so lines are str
for line in proc.stderr:
    sys.stdout.write(line)   # echo stderr to the screen as it arrives
    log_file.write(line)     # and log it
proc.wait()
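The snippet above assumes file_out and log_file are already-open file objects. Here is a self-contained version that can actually be run, with sh -c standing in for the hypothetical path_to_tool and temporary files standing in for the two output files (all demo assumptions):

```python
import subprocess
import sys
import tempfile

# Stand-ins for the question's file_out and log_file.
file_out = tempfile.TemporaryFile(mode='w+')
log_file = tempfile.TemporaryFile(mode='w+')

# 'sh -c' stands in for path_to_tool: one line to stdout, one to stderr.
proc = subprocess.Popen(['sh', '-c', 'echo result; echo status >&2'],
                        stdout=file_out, stderr=subprocess.PIPE,
                        universal_newlines=True)
for line in proc.stderr:
    sys.stdout.write(line)   # echo stderr to the screen as it arrives
    log_file.write(line)     # and log it
proc.wait()

file_out.seek(0)
log_file.seek(0)
out_text = file_out.read()   # stdout went straight to the file
log_text = log_file.read()   # stderr was logged line by line
```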

(Note that there are some issues using for line in proc.stderr:—basically, if what you're reading turns out not to be line-buffered for any reason, you can sit around waiting for a newline even though there's actually half a line worth of data to process. You can read chunks at a time with, say, read(128), or even read(1), to get the data more smoothly if necessary. If you need to actually get every byte as soon as it arrives, and can't afford the cost of read(1), you'll need to put the pipe in non-blocking mode and read asynchronously.)
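For instance, reading one character at a time avoids waiting for a newline. A sketch under the same stand-in assumptions (sh -c plays the role of the tool, and the log is kept in memory):

```python
import subprocess
import sys

# 'sh -c' stands in for path_to_tool; the log is an in-memory list here.
proc = subprocess.Popen(['sh', '-c', 'echo status message >&2'],
                        stderr=subprocess.PIPE, universal_newlines=True)
chunks = []
while True:
    ch = proc.stderr.read(1)   # returns without waiting for a full line
    if not ch:                 # empty string means EOF: the pipe closed
        break
    sys.stdout.write(ch)       # echo immediately, even mid-line
    chunks.append(ch)
proc.wait()
log_text = ''.join(chunks)
```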

But if you're on Unix, it might be simpler to use the tee command to do it for you.

For a quick&dirty solution, you can use the shell to pipe through it. Something like this:

subprocess.call('path_to_tool -option1 -option2 2> >(tee log_file >&2)',
                shell=True, executable='/bin/bash', stdout=file_out)

(Process substitution needs bash, hence executable='/bin/bash'.)

But I don't want to debug shell piping; let's do it in Python, as shown in the docs:

tool = subprocess.Popen(['path_to_tool', '-option1', '-option2'],
                        stdout=file_out, stderr=subprocess.PIPE)
tee = subprocess.Popen(['tee', 'log_file'], stdin=tool.stderr)
tool.stderr.close()   # drop our copy so tee sees EOF when the tool exits
tee.communicate()
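A runnable version of the same chain, with sh -c standing in for the tool and a temporary file as the log (both are demo assumptions):

```python
import subprocess
import tempfile

# A temporary path stands in for 'log_file' from the example above.
with tempfile.NamedTemporaryFile(mode='r', delete=False) as tmp:
    log_path = tmp.name

tool = subprocess.Popen(['sh', '-c', 'echo status >&2'],
                        stderr=subprocess.PIPE)
tee = subprocess.Popen(['tee', log_path], stdin=tool.stderr)
tool.stderr.close()   # drop our copy so tee sees EOF when the tool exits
tee.communicate()     # tee echoes to the screen and writes the log
tool.wait()

with open(log_path) as f:
    log_text = f.read()
```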

Finally, there are a dozen or more higher-level wrappers around subprocesses and/or the shell on PyPI—sh, shell, shell_command, shellout, iterpipes, sarge, cmd_utils, commandwrapper, etc. Search for "shell", "subprocess", "process", "command line", etc. and find one you like that makes the problem trivial.

What if you need to gather both stderr and stdout?

The easy way to do it is to just redirect one to the other, as Sven Marnach suggests in a comment. Just change the Popen parameters like this:

tool = subprocess.Popen(['path_to_tool', '-option1', '-option2'],
                        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)

And then everywhere you used tool.stderr, use tool.stdout instead—e.g., for the last example:

tee = subprocess.Popen(['tee', 'log_file'], stdin=tool.stdout)
tool.stdout.close()
tee.communicate()
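As a self-contained illustration of the merged stream (again with sh -c as a stand-in for path_to_tool):

```python
import subprocess

# stderr is redirected into the stdout pipe, so one read gets both streams.
tool = subprocess.Popen(['sh', '-c', 'echo result; echo status >&2'],
                        stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
merged, _ = tool.communicate()   # stderr was merged, so the second value is None
# merged now holds both streams, interleaved in arrival order
```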

But this has some tradeoffs. Most obviously, mixing the two streams together means you can't log stdout to file_out and stderr to log_file, or copy stdout to your stdout and stderr to your stderr. But it also means the ordering can be non-deterministic—if the subprocess always writes two lines to stderr before writing anything to stdout, you might end up getting a bunch of stdout between those two lines once you mix the streams. And it means they have to share stdout's buffering mode, so if you were relying on the fact that linux/glibc guarantees stderr to be line-buffered (unless the subprocess explicitly changes it), that may no longer be true.

If you need to handle the two processes separately, it gets more difficult. Earlier, I said that servicing the pipe on the fly is easy as long as you only have one pipe and can service it synchronously. If you have two pipes, that's obviously no longer true. Imagine you're waiting on tool.stdout.read(), and new data comes in from tool.stderr. If there's too much data, it can cause the pipe to overflow and the subprocess to block. But even if that doesn't happen, you obviously won't be able to read and log the stderr data until something comes in from stdout.

If you use the pipe-through-tee solution, that avoids the initial problem… but only by creating a new problem that's just as bad. You have two tee processes, and while you're calling communicate on one, the other is sitting around waiting forever.

So, either way, you need some kind of asynchronous mechanism. You can do this with threads, a select reactor, something like gevent, etc.

Here's a quick and dirty example:

import subprocess
import sys
import threading

proc = subprocess.Popen(['path_to_tool', '-option1', '-option2'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        universal_newlines=True)

def tee_pipe(pipe, f1, f2):
    for line in pipe:
        f1.write(line)
        f2.write(line)

t1 = threading.Thread(target=tee_pipe, args=(proc.stdout, file_out, sys.stdout))
t2 = threading.Thread(target=tee_pipe, args=(proc.stderr, log_file, sys.stderr))
t3 = threading.Thread(target=proc.wait)
t1.start(); t2.start(); t3.start()
t1.join(); t2.join(); t3.join()
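A self-contained run of the same threaded approach, with sh -c standing in for the tool and io.StringIO buffers standing in for file_out and log_file (all demo assumptions):

```python
import io
import subprocess
import sys
import threading

file_out = io.StringIO()   # stands in for the real output file
log_file = io.StringIO()   # stands in for the real log file

proc = subprocess.Popen(['sh', '-c', 'echo result; echo status >&2'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE,
                        universal_newlines=True)

def tee_pipe(pipe, f1, f2):
    for line in pipe:
        f1.write(line)
        f2.write(line)

t1 = threading.Thread(target=tee_pipe, args=(proc.stdout, file_out, sys.stdout))
t2 = threading.Thread(target=tee_pipe, args=(proc.stderr, log_file, sys.stderr))
t1.start(); t2.start()
t1.join(); t2.join()
proc.wait()
```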

However, there are some edge cases where that won't work. (The problem is the order in which SIGCHLD and SIGPIPE/EPIPE/EOF arrive. I don't think any of that will affect us here, since we're not sending any input… but don't trust me on that without thinking it through and/or testing.) The Popen.communicate implementation in 3.3+ gets all the fiddly details right. But you may find it a lot simpler to use one of the async-subprocess wrapper implementations you can find on PyPI and ActiveState, or even the subprocess stuff from a full-fledged async framework like Twisted.

* The docs don't really explain what pipes are, almost as if they expect you to be an old Unix C hand… But some of the examples, especially in the Replacing Older Functions with the subprocess Module section, show how they're used, and it's pretty simple.

** The hard part is sequencing two or more pipes properly. If you wait on one pipe, the other may overflow and block, preventing your wait on the first one from ever finishing. The only easy way to get around this is to create a thread to service each pipe. (On most *nix platforms, you can use a select or poll reactor instead, but making that cross-platform is amazingly difficult.) The source to the subprocess module, especially communicate and its helpers, shows how to do it. (I linked to the 3.3 source, because in earlier versions, communicate itself gets some important things wrong…) This is why, whenever possible, you want to use communicate if you need more than one pipe. In your case, you can't use communicate, but fortunately you don't need more than one pipe.
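On *nix, such a reactor loop looks roughly like this. A sketch using the stdlib selectors module and os.read, with sh -c again standing in for the tool (both demo assumptions):

```python
import os
import selectors
import subprocess

# 'sh -c' stands in for the tool: one line to stdout, one to stderr.
proc = subprocess.Popen(['sh', '-c', 'echo out; echo err >&2'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
sel = selectors.DefaultSelector()
sel.register(proc.stdout, selectors.EVENT_READ, 'out')
sel.register(proc.stderr, selectors.EVENT_READ, 'err')
captured = {'out': b'', 'err': b''}
while sel.get_map():                   # loop until both pipes hit EOF
    for key, _ in sel.select():
        data = os.read(key.fd, 4096)   # returns as soon as any data is ready
        if not data:                   # EOF on this pipe
            sel.unregister(key.fileobj)
            continue
        captured[key.data] += data
proc.stdout.close()
proc.stderr.close()
proc.wait()
```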
