Run command and get its stdout, stderr separately in near real time like in a terminal


Problem description


I am trying to find a way in Python to run other programs in such a way that:

  1. The stdout and stderr of the program being run can be logged separately.
  2. The stdout and stderr of the program being run can be viewed in near-real time, such that if the child process hangs, the user can see. (i.e. we do not wait for execution to complete before printing the stdout/stderr to the user)
  3. Bonus criteria: The program being run does not know it is being run via python, and thus will not do unexpected things (like chunk its output instead of printing it in real-time, or exit because it demands a terminal to view its output). This small criteria pretty much means we will need to use a pty I think.

Here is what I've got so far... Method 1:

def method1(command):
    ## subprocess.communicate() will give us the stdout and stderr separately,
    ## but we will have to wait until the end of command execution to print anything.
    ## This means if the child process hangs, we will never know....
    proc=subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, executable='/bin/bash')
    stdout, stderr = proc.communicate() # record both, but no way to print stdout/stderr in real-time
    print ' ######### REAL-TIME ######### '
    ########         Not Possible
    print ' ########## RESULTS ########## '
    print 'STDOUT:'
    print stdout
    print 'STDERR:'
    print stderr

Method 2

def method2(command):
    ## Using pexpect to run our command in a pty, we can see the child's stdout in real-time,
    ## however we cannot see the stderr from "curl google.com", presumably because it is not connected to a pty?
    ## Furthermore, I do not know how to log it beyond writing out to a file (p.logfile). I need the stdout and stderr
    ## as strings, not files on disk! On the upside, pexpect would give a lot of extra functionality (if it worked!)
    proc = pexpect.spawn('/bin/bash', ['-c', command])
    print ' ######### REAL-TIME ######### '
    proc.interact()
    print ' ########## RESULTS ########## '
    ########         Not Possible

Method 3:

def method3(command):
    ## This method is very much like method1, and would work exactly as desired
    ## if only proc.xxx.read(1) wouldn't block waiting for something. Which it does. So this is useless.
    proc=subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE, shell=True, executable='/bin/bash')
    print ' ######### REAL-TIME ######### '
    out,err,outbuf,errbuf = '','','',''
    firstToSpeak = None
    while proc.poll() == None:
            stdout = proc.stdout.read(1) # blocks
            stderr = proc.stderr.read(1) # also blocks
            if firstToSpeak == None:
                if stdout != '': firstToSpeak = 'stdout'; outbuf,errbuf = stdout,stderr
                elif stderr != '': firstToSpeak = 'stderr'; outbuf,errbuf = stdout,stderr
            else:
                if (stdout != '') or (stderr != ''): outbuf += stdout; errbuf += stderr
                else:
                    out += outbuf; err += errbuf;
                    if firstToSpeak == 'stdout': sys.stdout.write(outbuf+errbuf);sys.stdout.flush()
                    else: sys.stdout.write(errbuf+outbuf);sys.stdout.flush()
                    firstToSpeak = None
    print ''
    print ' ########## RESULTS ########## '
    print 'STDOUT:'
    print out
    print 'STDERR:'
    print err

To try these methods out, you will need to import sys, subprocess, pexpect

pexpect is pure Python and can be had with

sudo pip install pexpect

I think the solution will involve python's pty module - which is somewhat of a black art that I cannot find anyone who knows how to use. Perhaps SO knows :) As a heads-up, I recommend you use 'curl www.google.com' as a test command, because it prints its status out on stderr for some reason :D


UPDATE-1:
OK so the pty library is not fit for human consumption. The docs, essentially, are the source code. Any presented solution that is blocking and not async is not going to work here. The Threads/Queue method by Padraic Cunningham works great, although adding pty support is not possible - and it's 'dirty' (to quote Freenode's #python). It seems like the only solution fit for production-standard code is using the Twisted framework, which even supports pty as a boolean switch to run processes exactly as if they were invoked from the shell. But adding Twisted into a project requires a total rewrite of all the code. This is a total bummer :/
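For reference, the Threads/Queue approach works roughly like this (a minimal sketch of the general technique, not Padraic Cunningham's exact answer; the test command is illustrative): a reader thread blocks on each pipe and pushes tagged lines onto a shared queue, so the main thread only ever blocks on the queue and sees both streams as they arrive:

```python
import subprocess
from queue import Queue
from threading import Thread

def pump(pipe, name, queue):
    # Block on one pipe in a worker thread; tag each line with its stream.
    for line in iter(pipe.readline, b''):
        queue.put((name, line))
    pipe.close()
    queue.put((name, None))  # sentinel: this stream reached EOF

proc = subprocess.Popen(['sh', '-c', 'echo out-line; echo err-line >&2'],
                        stdout=subprocess.PIPE, stderr=subprocess.PIPE)
q = Queue()
for name, pipe in (('stdout', proc.stdout), ('stderr', proc.stderr)):
    Thread(target=pump, args=(pipe, name, q), daemon=True).start()

logs, open_streams = {'stdout': b'', 'stderr': b''}, 2
while open_streams:
    name, line = q.get()      # main thread blocks here, never on the pipes
    if line is None:
        open_streams -= 1
    else:
        logs[name] += line    # log separately; could also print immediately
proc.wait()
```

Note this still reads from plain pipes, so it satisfies the first two criteria but not the pty one: the child still sees a pipe on its stdout.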

UPDATE-2:

Two answers were provided, one of which addresses the first two criteria and will work well where you just need both the stdout and stderr, using Threads and Queue. The other answer uses select, a non-blocking method for reading file descriptors, and pty, a method to "trick" the spawned process into believing it is running in a real terminal just as if it was run from Bash directly - but may or may not have side-effects. I wish I could accept both answers, because the "correct" method really depends on the situation and why you are subprocessing in the first place, but alas, I could only accept one.

Solution

The stdout and stderr of the program being run can be logged separately.

You can't use pexpect because both stdout and stderr go to the same pty and there is no way to separate them after that.

The stdout and stderr of the program being run can be viewed in near-real time, such that if the child process hangs, the user can see. (i.e. we do not wait for execution to complete before printing the stdout/stderr to the user)

If the output of a subprocess is not a tty then it is likely that it uses block buffering, and therefore if it doesn't produce much output it won't be "real time": e.g., if the buffer is 4K, your parent Python process won't see anything until the child process prints 4K chars and the buffer overflows or is flushed explicitly (inside the subprocess). This buffer lives inside the child process and there are no standard ways to manage it from outside. [The original answer includes a diagram of the stdio buffers and the pipe buffer for a command1 | command2 shell pipeline.]
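To see this buffering in action (a small sketch; the inline child script is made up for illustration), run a child whose stdout is a pipe and watch its first line arrive only when the process exits:

```python
import subprocess
import sys
import time

# The child prints one line, then sleeps; because its stdout is a pipe
# (not a tty), CPython block-buffers it, so the line sits in the child's
# buffer until the interpreter exits and flushes.
child = [sys.executable, '-c',
         "import time; print('hello'); time.sleep(1)"]
with subprocess.Popen(child, stdout=subprocess.PIPE) as p:
    t0 = time.monotonic()
    line = p.stdout.readline()   # does NOT return until the child exits
    elapsed = time.monotonic() - t0
```

Even though the child "printed" immediately, the parent only sees the line after roughly the full sleep, because the flush happened at child exit.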

The program being run does not know it is being run via python, and thus will not do unexpected things (like chunk its output instead of printing it in real-time, or exit because it demands a terminal to view its output).

It seems, you meant the opposite i.e., it is likely that your child process chunks its output instead of flushing each output line as soon as possible if the output is redirected to a pipe (when you use stdout=PIPE in Python). It means that the default threading or asyncio solutions won't work as is in your case.

There are several options to work around it:

  • the command may accept a command-line argument such as grep --line-buffered or python -u, to disable block buffering.
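As a quick sketch of this first option (the inline child script is made up for illustration), running the child under python -u makes each line reach the parent as soon as it is printed, even though its stdout is a pipe:

```python
import subprocess
import sys
import time

# With -u the child's stdout is unbuffered, so 'first' arrives right
# away instead of waiting in a block buffer until the child exits.
child = [sys.executable, '-u', '-c',
         "import time; print('first'); time.sleep(1); print('second')"]
with subprocess.Popen(child, stdout=subprocess.PIPE) as p:
    t0 = time.monotonic()
    first = p.stdout.readline()    # returns almost immediately
    waited = time.monotonic() - t0
    second = p.stdout.readline()   # returns about a second later
```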

  • stdbuf works for some programs, i.e., you could run ['stdbuf', '-oL', '-eL'] + command using the threading or asyncio solution above, and you should get stdout and stderr separately, with lines appearing in near-real time:

    #!/usr/bin/env python3
    import os
    import sys
    from select import select
    from subprocess import Popen, PIPE
    
    with Popen(['stdbuf', '-oL', '-e0', 'curl', 'www.google.com'],
               stdout=PIPE, stderr=PIPE) as p:
        readable = {
            p.stdout.fileno(): sys.stdout.buffer, # log separately
            p.stderr.fileno(): sys.stderr.buffer,
        }
        while readable:
            for fd in select(readable, [], [])[0]:
                data = os.read(fd, 1024) # read available
                if not data: # EOF
                    del readable[fd]
                else: 
                    readable[fd].write(data)
                    readable[fd].flush()
    

  • finally, you could try pty + select solution with two ptys:

    #!/usr/bin/env python3
    import errno
    import os
    import pty
    import sys
    from select import select
    from subprocess import Popen
    
    masters, slaves = zip(pty.openpty(), pty.openpty())
    with Popen([sys.executable, '-c', r'''import sys, time
    print('stdout', 1) # no explicit flush
    time.sleep(.5)
    print('stderr', 2, file=sys.stderr)
    time.sleep(.5)
    print('stdout', 3)
    time.sleep(.5)
    print('stderr', 4, file=sys.stderr)
    '''],
               stdin=slaves[0], stdout=slaves[0], stderr=slaves[1]):
        for fd in slaves:
            os.close(fd) # no input
        readable = {
            masters[0]: sys.stdout.buffer, # log separately
            masters[1]: sys.stderr.buffer,
        }
        while readable:
            for fd in select(readable, [], [])[0]:
                try:
                    data = os.read(fd, 1024) # read available
                except OSError as e:
                    if e.errno != errno.EIO:
                        raise #XXX cleanup
                    del readable[fd] # EIO means EOF on some systems
                else:
                    if not data: # EOF
                        del readable[fd]
                    else:
                        readable[fd].write(data)
                        readable[fd].flush()
    for fd in masters:
        os.close(fd)
    

    I don't know what the side-effects are of using different ptys for stdout and stderr. You could try whether a single pty is enough in your case, e.g., set stderr=PIPE and use p.stderr.fileno() instead of masters[1]. A comment in the sh source suggests that there are issues if stderr is not in {STDOUT, pipe}
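That single-pty variant might be sketched like this (an assumption, not something tested in the original answer; it relies on EIO signalling EOF on the pty master, which is Linux behaviour, and the inline child script is illustrative). The child sees a terminal on stdout, so it line-buffers there, while stderr stays a plain pipe:

```python
import errno
import os
import pty
import sys
from select import select
from subprocess import Popen, PIPE

master, slave = pty.openpty()  # one pty, for stdout only
child = [sys.executable, '-c',
         "import sys; print('out line'); print('err line', file=sys.stderr)"]
out, err = b'', b''
with Popen(child, stdin=slave, stdout=slave, stderr=PIPE) as p:
    os.close(slave)                        # parent keeps only the master end
    fds = {master, p.stderr.fileno()}
    while fds:
        for fd in select(fds, [], [])[0]:
            try:
                data = os.read(fd, 1024)
            except OSError as e:
                if e.errno != errno.EIO:   # EIO means EOF on a pty (Linux)
                    raise
                data = b''
            if not data:                   # EOF on this descriptor
                fds.discard(fd)
            elif fd == master:
                out += data                # child thinks stdout is a terminal
            else:
                err += data
os.close(master)
```

Note the pty's line discipline translates the child's '\n' to '\r\n' on stdout, one small example of the side-effects mentioned above.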
