Asynchronous subprocess on Windows

Question

First of all, the overall problem I am solving is a bit more complicated than I am showing here, so please do not tell me 'use threads with blocking' as it would not solve my actual situation without a fair, FAIR bit of rewriting and refactoring.

I have several applications which are not mine to modify, which take data from stdin and poop it out on stdout after doing their magic. My task is to chain several of these programs. Problem is, sometimes they choke, and as such I need to track their progress which is outputted on STDERR.

pA = subprocess.Popen(CommandA, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
# ... some more processes make up the chain, but that is irrelevant to the problem
pB = subprocess.Popen(CommandB, shell=False, stdout=subprocess.PIPE, stderr=subprocess.PIPE, stdin=pA.stdout)

Now, reading directly through pA.stdout.readline() and pB.stdout.readline(), or the plain read() functions, is a blocking matter. Since the different applications output at different paces and in different formats, blocking is not an option. (And as I wrote above, threading is not an option unless as a last, last resort.) pA.communicate() is deadlock-safe, but since I need the information live, that is not an option either.
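A minimal sketch of that blocking behaviour (Python 3 here for brevity, with a hypothetical slow child standing in for one of the real chained programs):

```python
import subprocess
import sys
import time

# Hypothetical slow child standing in for one of the chained programs:
# it emits one line, then sleeps before producing the next.
child = subprocess.Popen(
    [sys.executable, "-u", "-c",
     "import time; print('line 1'); time.sleep(1); print('line 2')"],
    stdout=subprocess.PIPE)

first = child.stdout.readline()   # returns as soon as the child prints
start = time.time()
second = child.stdout.readline()  # blocks ~1 s until the next line arrives
blocked = time.time() - start
child.wait()

print(first, second, round(blocked, 1))
```

The second readline() cannot return early with "nothing yet"; it simply waits, which is exactly the problem when several pipes must be watched at once.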

Thus Google brought me to this asynchronous subprocess snippet on ActiveState: http://code.activestate.com/recipes/440554-module-to-allow-asynchronous-subprocess-use-on-win/

All good at first, until I implement it. Comparing the cmd.exe output of pA.exe | pB.exe, ignoring the fact that both output to the same window making for a mess, I see very instantaneous updates. However, when I implement the same thing using the above snippet and the read_some() function declared there, it takes over 10 seconds to notify updates from a single pipe. But when it does, it has updates leading all the way up to 40% progress, for example.

Thus I do some more research, and see numerous subjects concerning PeekNamedPipe, anonymous handles, and PeekNamedPipe returning 0 bytes available even though there is information available in the pipe. As the subject has proven quite a bit beyond my expertise to fix or code around, I come to Stack Overflow to look for guidance. :)

My platform is W7 64-bit with Python 2.6, the applications are 32-bit in case it matters, and compatibility with Unix is not a concern. I can even deal with a full ctypes or pywin32 solution that subverts subprocess entirely if it is the only solution, as long as I can read from every stderr pipe asynchronously with immediate performance and no deadlocks. :)

Answer

How bad is it to have to use threads? I encountered much the same problem and eventually decided to use threads to gather up all the data on a sub-process's stdout and stderr and put it onto a thread-safe queue which the main thread can read in a blocking fashion, without having to worry about the threading going on behind the scenes.

It's not clear what trouble you anticipate with a solution based on threads and blocking. Are you worried about having to make the rest of your code thread-safe? That shouldn't be an issue since the IO thread wouldn't need to interact with any of the rest of your code or data. If you have very restrictive memory requirements or your pipeline is particularly long, then perhaps you may feel unhappy about spawning so many threads. I don't know enough about your situation so I couldn't say if this is likely to be a problem, but it seems to me that since you're already spawning off extra processes, a few threads to interact with them should not be a terrible burden. In my situation I have not found these IO threads to be particularly problematic.

My thread function looked something like this:

import Queue  # Python 2; renamed "queue" in Python 3

def simple_io_thread(pipe, queue, tag, stop_event):
    """
    Read line-by-line from pipe, writing (tag, line) to the
    queue. Also checks for a stop_event to give up before
    the end of the stream.
    """
    while True:
        line = pipe.readline()

        while True:
            try:
                # Post to the queue with a large timeout in case the
                # queue is full.
                queue.put((tag, line), block=True, timeout=60)
                break
            except Queue.Full:
                if stop_event.isSet():
                    break
                continue
        if stop_event.isSet() or line == "":
            break
    pipe.close()

When I start up the subprocess I do this:

import subprocess
import threading

outputqueue = Queue.Queue(50)
stop_event = threading.Event()
process = subprocess.Popen(
    command,
    cwd=workingdir,
    env=env,
    shell=useshell,
    stdout=subprocess.PIPE,
    stderr=subprocess.PIPE)
stderr_thread = threading.Thread(
    target=simple_io_thread,
    args=(process.stderr, outputqueue, "STDERR", stop_event)
)
stdout_thread = threading.Thread(
    target=simple_io_thread,
    args=(process.stdout, outputqueue, "STDOUT", stop_event)
)
stderr_thread.daemon = True
stdout_thread.daemon = True
stderr_thread.start()
stdout_thread.start()

Then when I want to read I can just block on outputqueue - each item read from it contains a tag identifying which pipe it came from and a line of text from that pipe. Very little code runs in a separate thread, and it only communicates with the main thread via a thread-safe queue (plus an event in case I need to give up early). Perhaps this approach would be useful and allow you to solve the problem with threads and blocking but without having to rewrite lots of code?
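As a rough, self-contained Python 3 sketch of that consumer loop (the pump helper and the end-of-stream sentinel are illustrative additions, not part of the original code):

```python
import queue
import subprocess
import sys
import threading

def pump(pipe, q, tag):
    # Reader thread: forward (tag, line) pairs; a final (tag, b"")
    # marks end-of-stream for that pipe.
    for line in iter(pipe.readline, b""):
        q.put((tag, line))
    q.put((tag, b""))
    pipe.close()

q = queue.Queue()
proc = subprocess.Popen(
    [sys.executable, "-c",
     "import sys; print('out'); print('err', file=sys.stderr)"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
for pipe, tag in ((proc.stdout, "STDOUT"), (proc.stderr, "STDERR")):
    threading.Thread(target=pump, args=(pipe, q, tag), daemon=True).start()

# The main thread just blocks on the queue; each item says which
# pipe it came from.
seen = []
open_streams = 2
while open_streams:
    tag, line = q.get()
    if line == b"":
        open_streams -= 1
    else:
        seen.append((tag, line.strip().decode()))
        print(tag, seen[-1][1])
proc.wait()
```

Because both reader threads feed the same queue, the main thread sees stdout and stderr lines interleaved in arrival order, each labelled with its source.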

(我的解决方案变得更加复杂,因为有时我真希望提前终止该子进程,并希望确保该线程将全部完成。如果这不是你可以摆脱所有stop_event东西的问题,它成为pretty简洁。)

(My solution is made more complicated because I sometimes wish to terminate the subprocesses early, and want to be sure that the threads will all finish. If that's not an issue you can get rid of all the stop_event stuff and it becomes pretty succinct.)
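A hedged sketch of what that early termination could look like in Python 3 (reader is a minimal stand-in for simple_io_thread above; the names and the throwaway child command are illustrative):

```python
import queue
import subprocess
import sys
import threading

def reader(pipe, q, tag, stop_event):
    # Minimal stand-in for simple_io_thread above.
    while not stop_event.is_set():
        line = pipe.readline()
        q.put((tag, line))
        if line == b"":  # EOF: the child exited or closed the pipe
            break
    pipe.close()

q = queue.Queue()
stop_event = threading.Event()
# A child that would run for a minute if left alone.
proc = subprocess.Popen(
    [sys.executable, "-c", "import time; time.sleep(60)"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
threads = [
    threading.Thread(target=reader, args=(p, q, t, stop_event), daemon=True)
    for p, t in ((proc.stdout, "STDOUT"), (proc.stderr, "STDERR"))
]
for t in threads:
    t.start()

# Early termination: signal the threads, then kill the child.  Killing
# it closes the write ends of the pipes, so each blocked readline()
# returns b"" and the reader threads fall out of their loops.
stop_event.set()
proc.terminate()
proc.wait()
for t in threads:
    t.join(timeout=5)
print(all(not t.is_alive() for t in threads))
```

The key point is that terminating the child is what unblocks the readline() calls; the event alone is not enough, since a thread parked inside readline() never gets to check it.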
