Checking to see if there is more data to read from a file descriptor using Python's select module

Question

I have a program that creates a subprocess within a thread, so that the thread can be constantly checking for specific output conditions (from either stdout or stderr), and call the appropriate callbacks, while the rest of the program continues. Here is a pared-down version of that code:

import select
import subprocess
import threading

def run_task():
    command = ['python', 'a-script-that-outputs-lines.py']
    proc = subprocess.Popen(command, stdout = subprocess.PIPE, stderr = subprocess.PIPE)
    while True:

        ready, _, _ = select.select((proc.stdout, proc.stderr), (), (), .1)

        if proc.stdout in ready:
            next_line_to_process = proc.stdout.readline()
            # process the output

        if proc.stderr in ready:
            next_line_to_process = proc.stderr.readline()
            # process the output

        if not ready and proc.poll() is not None:
            break

thread = threading.Thread(target = run_task)
thread.run()

It works reasonably well, but I would like the thread to exit once two conditions are met: the running child process has finished, and all of the data in stdout and stderr has been processed.

The difficulty I have is that if my last condition is as it is above (if not ready and proc.poll() is not None), then the thread never exits, because once stdout and stderr's file descriptors are marked as ready, they never become unready (even after all of the data has been read from them, and read() would hang or readline() would return an empty string).

If I change that condition to just if proc.poll() is not None, then the loop exits as soon as the child process exits, and I can't guarantee that it has seen all of the data that needs to be processed.

Is this just the wrong approach, or is there a way to reliably determine when you've read all of the data that will ever be written to a file descriptor? Or is this an issue specific to trying to read from the stderr/stdout of a subprocess?

I have been trying this on Python 2.5 (running on OS X) and also tried select.poll() and select.epoll()-based variants on Python 2.6 (running on Debian with a 2.6 kernel).

Recommended answer

The select module is appropriate if you want to find out whether you can read from a pipe without blocking.
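Note that "readable" also covers end-of-file: once every write end of a pipe is closed, select() keeps reporting the read end as ready, and a read returns an empty bytestring. That is why the descriptors in the question never become "unready". A minimal sketch, assuming Python 3 on a POSIX system; is_readable is an illustrative helper, not part of the select module:

```python
import os
import select

def is_readable(fd, timeout=0):
    """Return True if select() reports fd as ready for reading."""
    ready, _, _ = select.select([fd], [], [], timeout)
    return fd in ready

read_end, write_end = os.pipe()

empty_pipe = is_readable(read_end)      # nothing written yet

os.write(write_end, b"line\n")
with_data = is_readable(read_end)       # data is waiting

os.read(read_end, 4096)                 # consume the data
os.close(write_end)                     # no writers left
at_eof = is_readable(read_end)          # EOF also counts as "ready"
eof_read = os.read(read_end, 4096)      # reading at EOF does not block

os.close(read_end)
```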

To make sure that you've read all of the data, use the simpler condition if proc.poll() is not None: break, and call rest = [pipe.read() for pipe in [proc.stdout, proc.stderr]] after the loop.

It is unlikely that a subprocess closes its stdout/stderr before it shuts down, so for simplicity you could skip the logic that handles EOF.
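A minimal sketch of this approach, assuming Python 3 on a POSIX system. The names run_and_collect, timeout, and limit are illustrative, and os.read() is used inside the loop instead of readline() so that the loop cannot block on a partial line:

```python
import os
import select
import subprocess
import sys

def run_and_collect(command, timeout=0.1, limit=4096):
    """Read output while the child runs, then drain what is left after it exits."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    chunks = {proc.stdout: [], proc.stderr: []}

    while True:
        ready, _, _ = select.select(list(chunks), [], [], timeout)
        for pipe in ready:
            # os.read returns whatever is available (b"" at EOF) without
            # waiting for a full line.
            chunks[pipe].append(os.read(pipe.fileno(), limit))
        if proc.poll() is not None:
            break  # child has exited; leftovers are drained below

    # The child is gone, so read() returns the remaining data and then EOF.
    for pipe in chunks:
        chunks[pipe].append(pipe.read())
        pipe.close()

    out = b"".join(chunks[proc.stdout])
    err = b"".join(chunks[proc.stderr])
    return proc.returncode, out, err

if __name__ == "__main__":
    demo = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"
    print(run_and_collect([sys.executable, "-c", demo]))
```

The final read() can still wait if some other process (for example a grandchild) inherited the pipe and keeps it open; this sketch assumes that is not the case.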

Don't call Thread.run() directly; use Thread.start() instead. run() simply executes the target in the calling thread, so no new thread is created. You probably don't need the separate thread here at all.
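The difference between the two calls can be seen by recording which thread actually executes the target. This is only an illustration; record_thread_name and the thread names are made up for the example:

```python
import threading

def record_thread_name(names):
    """Append the name of the thread that is executing this function."""
    names.append(threading.current_thread().name)

names = []

# run() is an ordinary method call: the target executes in the calling thread.
direct = threading.Thread(target=record_thread_name, args=(names,), name="worker-run")
direct.run()

# start() creates a new thread, which then calls run() itself.
started = threading.Thread(target=record_thread_name, args=(names,), name="worker-start")
started.start()
started.join()
```

After this runs, the first entry in names is the name of the calling thread rather than "worker-run", while the second entry is "worker-start".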

Don't call proc.stdout.readline() after select(): it may block, because select() only reports that some data is available, not that a complete line is. Use os.read(proc.stdout.fileno(), limit) instead. An empty bytestring indicates EOF for the corresponding pipe.
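If you do want to handle EOF explicitly, one possible shape is to keep selecting until both pipes have returned an empty bytestring. This is a sketch assuming Python 3 on a POSIX system; read_until_eof and limit are illustrative names:

```python
import os
import select
import subprocess
import sys

def read_until_eof(command, limit=4096):
    """Collect stdout and stderr until both pipes report EOF."""
    proc = subprocess.Popen(command, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    # Map each open file descriptor to the buffer that collects its output.
    buffers = {proc.stdout.fileno(): [], proc.stderr.fileno(): []}
    out_chunks = buffers[proc.stdout.fileno()]
    err_chunks = buffers[proc.stderr.fileno()]
    open_fds = set(buffers)

    while open_fds:
        ready, _, _ = select.select(list(open_fds), [], [])
        for fd in ready:
            data = os.read(fd, limit)
            if data:
                buffers[fd].append(data)
            else:
                open_fds.discard(fd)  # empty bytestring: EOF on this pipe

    proc.stdout.close()
    proc.stderr.close()
    returncode = proc.wait()  # both pipes are at EOF; reap the child
    return returncode, b"".join(out_chunks), b"".join(err_chunks)

if __name__ == "__main__":
    demo = "import sys; print('to stdout'); print('to stderr', file=sys.stderr)"
    print(read_until_eof([sys.executable, "-c", demo]))
```

Because the loop ends only when both pipes are at EOF, no data can be missed, and no timeout or poll() check is needed inside the loop.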

As an alternative, or in addition, you could make the pipes non-blocking using the fcntl module:

import os
from fcntl import fcntl, F_GETFL, F_SETFL

def make_nonblocking(fd):
    return fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | os.O_NONBLOCK)

and handle the I/O and OS errors that a read raises when no data is available.
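As a sketch of what that error handling might look like, assuming Python 3.3 or later on a POSIX system (where an empty non-blocking read raises BlockingIOError); read_available is an illustrative helper, not a standard function:

```python
import os
from fcntl import fcntl, F_GETFL, F_SETFL

def make_nonblocking(fd):
    """Set O_NONBLOCK on fd, keeping its other status flags."""
    return fcntl(fd, F_SETFL, fcntl(fd, F_GETFL) | os.O_NONBLOCK)

def read_available(fd, limit=4096):
    """Return ready bytes, b"" at EOF, or None if nothing is ready yet."""
    try:
        return os.read(fd, limit)
    except BlockingIOError:
        # The pipe is still open but has no data right now (EAGAIN).
        return None

if __name__ == "__main__":
    read_end, write_end = os.pipe()
    make_nonblocking(read_end)
    print(read_available(read_end))   # no data yet
    os.write(write_end, b"hello")
    print(read_available(read_end))   # the bytes just written
    os.close(write_end)
    print(read_available(read_end))   # EOF
    os.close(read_end)
```

Returning None for "no data yet" keeps that case distinct from the empty bytestring that signals EOF.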
