Python-Subprocess-Popen inconsistent behavior in a multi-threaded environment
Problem Description
I have the following piece of code running inside a thread; 'executable' produces a unique string output for each input 'url':
p = Popen(["executable", url], stdout=PIPE, stderr=PIPE, close_fds=True)
output, error = p.communicate()
print(output)
When the above code gets executed for multiple input 'urls', the subprocess p's 'output' is not consistent. For some of the urls, the subprocess gets terminated without producing any 'output'. I tried printing p.returncode for each failed 'p' instance (the failed urls are not consistent across runs either) and got '-11' as the return code, with 'error' as an empty string. Can someone please suggest a way to get consistent behavior/output on each run in a multi-threaded environment?
Recommended Answer
-11 as a return code might mean that the C program is not fine, e.g., you are starting too many subprocesses and it causes SIGSEGV in the C executable. You can limit the number of concurrent subprocesses using multiprocessing.pool.ThreadPool, concurrent.futures.ThreadPoolExecutor, or a threading + Queue -based solution:
#!/usr/bin/env python
from multiprocessing.dummy import Pool  # uses threads
from subprocess import Popen, PIPE

def get_url(url):
    p = Popen(["executable", url], stdout=PIPE, stderr=PIPE, close_fds=True)
    output, error = p.communicate()
    return url, output, error, p.returncode

pool = Pool(20)  # limit number of concurrent subprocesses
for url, output, error, returncode in pool.imap_unordered(get_url, urls):
    print("%s %r %r %d" % (url, output, error, returncode))
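The concurrent.futures.ThreadPoolExecutor alternative mentioned above can be sketched the same way. In this sketch, `echo` stands in for the real "executable" so it runs anywhere, and the `urls` list is a placeholder:

```python
#!/usr/bin/env python
from concurrent.futures import ThreadPoolExecutor
from subprocess import Popen, PIPE

def get_url(url):
    # "echo" is a stand-in for the real "executable" in this sketch
    p = Popen(["echo", url], stdout=PIPE, stderr=PIPE, close_fds=True)
    output, error = p.communicate()
    return url, output, error, p.returncode

urls = ["url1", "url2", "url3"]  # placeholder inputs
with ThreadPoolExecutor(max_workers=20) as executor:  # cap concurrent subprocesses
    results = list(executor.map(get_url, urls))

for url, output, error, returncode in results:
    print("%s %r %r %d" % (url, output, error, returncode))
```

`executor.map` preserves input order, unlike `imap_unordered` above, which yields results as they finish.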
Make sure the executable can run in parallel, e.g., that it doesn't use some shared resource. To test, you could run in a shell:
$ executable url1 & executable url2
Could you please explain more about "you are starting too many subprocesses and it causes SIGSEGV in the C executable", and possibly a solution to avoid that?
The likely problem:

- "too many processes"
- -> "not enough memory or other resources in the system"
- -> "triggers an otherwise hidden or rare bug in the C code"
- -> "illegal memory access"
- -> SIGSEGV

The solution suggested above is:

- "limit the number of concurrent processes"
- -> "enough memory or other resources in the system"
- -> "the bug stays hidden or rare"
- -> no SIGSEGV
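The "limit the number of concurrent processes" step can also be sketched with the threading + Queue approach mentioned earlier. Names here are illustrative, and `echo` again stands in for the real "executable":

```python
import threading
from queue import Queue
from subprocess import Popen, PIPE

NUM_WORKERS = 4  # hard cap on concurrent subprocesses

def worker(tasks, results):
    while True:
        url = tasks.get()
        if url is None:  # sentinel: no more work for this worker
            break
        # "echo" is a stand-in for the real "executable" in this sketch
        p = Popen(["echo", url], stdout=PIPE, stderr=PIPE, close_fds=True)
        output, error = p.communicate()
        results.put((url, output, error, p.returncode))

tasks, results = Queue(), Queue()
threads = [threading.Thread(target=worker, args=(tasks, results))
           for _ in range(NUM_WORKERS)]
for t in threads:
    t.start()
for url in ["url1", "url2", "url3"]:
    tasks.put(url)
for _ in threads:
    tasks.put(None)  # one sentinel per worker
for t in threads:
    t.join()

collected = []
while not results.empty():
    collected.append(results.get())
for item in collected:
    print(item)
```

At most NUM_WORKERS subprocesses ever exist at once, so memory and file-descriptor pressure stay bounded no matter how many urls are queued.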
Understand what a SIGSEGV runtime error is in C/C++. In short, your program is killed with that signal if it tries to access memory that it is not supposed to. Here's an example of such a program:
/* try to fail with SIGSEGV sometimes */
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

int main(void) {
    char *null_pointer = NULL;
    srand((unsigned)time(NULL));
    if (rand() < RAND_MAX / 2) /* simulate some concurrent condition,
                                  e.g., memory pressure */
        fprintf(stderr, "%c\n", *null_pointer); /* dereference null pointer */
    return 0;
}
If you run it with the above Python script, then it would return -11 occasionally.
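On POSIX systems, a negative `returncode` is the negated number of the signal that killed the child, so -11 can be decoded with Python's signal module. This self-contained check simulates the crashing C program by having a child Python process send itself SIGSEGV:

```python
import signal
import subprocess
import sys

# Child deliberately sends itself SIGSEGV, standing in for the crashing C program
p = subprocess.Popen(
    [sys.executable, "-c",
     "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"],
    stdout=subprocess.PIPE, stderr=subprocess.PIPE)
output, error = p.communicate()

print(p.returncode)                        # -11 on Linux
print(signal.Signals(-p.returncode).name)  # SIGSEGV
```

Decoding the signal name this way confirms whether a failed run really died of SIGSEGV rather than, say, SIGKILL from the OOM killer.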
Also, p.returncode is not sufficient for debugging purposes. Is there any other option to get more debug info, to find the root cause?
I won't exclude the Python side completely, but it is most likely that the problem is in the C program. You could use gdb to get a backtrace, to see where in the call stack the error comes from.