Status of mixing multiprocessing and threading in Python


Question

What are best practices or work-arounds for using both multiprocessing and user threads in the same Python application on Linux, with respect to Issue 6721, "Locks in python standard library should be sanitized on fork"?

Why do I need both? I use child processes to do heavy computation that produces data-structure results much too large to return through a queue -- rather, they must be stored to disk immediately. It seemed efficient to have each of these child processes monitored by a separate thread, so that when a child finished, its thread could handle the IO of reading the large (e.g. multi-GB) data back into the process where the result was needed for further computation in combination with the results of other child processes. The child processes would intermittently hang, which I just (after much head pounding) found was 'caused' by using the logging module. Others have documented the problem here:
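The pattern described above can be sketched as follows; the function and file names are placeholders, and the "heavy computation" is reduced to writing a short string where real code would write multi-GB results:

```python
import multiprocessing
import os
import tempfile
import threading


def heavy_compute(path):
    # Stand-in for a large computation; real results would be multi-GB,
    # hence written to disk rather than returned through a queue.
    with open(path, "w") as f:
        f.write("result-data")


def monitor(proc, path, results):
    # One thread per child: wait for the child to finish, then handle
    # the IO of reading its result back into the parent process.
    proc.join()
    with open(path) as f:
        results.append(f.read())


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "out.txt")
    proc = multiprocessing.Process(target=heavy_compute, args=(path,))
    proc.start()  # on Linux this forks the child
    results = []
    t = threading.Thread(target=monitor, args=(proc, path, results))
    t.start()
    t.join()
    print(results[0])  # → result-data
```

Note that in this sketch the monitor thread is started *after* the fork, which (as the answer below explains) is the safe ordering; the hangs in the question arise when threads (or logging's internal locks) exist before the fork.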

https://twiki.cern.ch/twiki/bin/view/Main/PythonLoggingThreadingMultiprocessingIntermixedStudy

which points to this apparently unsolved Python issue: "Locks in python standard library should be sanitized on fork"; http://bugs.python.org/issue6721

Alarmed at the difficulty I had tracking this down, I answered:

Is there any reason not to mix the multiprocessing and threading modules in Python?

with the rather unhelpful suggestion to 'Be careful' and links to the above.

But the lengthy discussion on Issue 6721 suggests that it is a 'bug' to use both multiprocessing (or os.fork) and user threads in the same application. With my limited understanding of the problem, I find too much disagreement in the discussion to conclude what the work-arounds or strategies are for using both multiprocessing and threading in the same application. My immediate problem was solved by disabling logging, but I create a small handful of other (explicit) locks in both parent and child processes, and suspect I am setting myself up for further intermittent deadlocks.

Can you give practical recommendations for avoiding deadlocks when using locks and/or the logging module together with threading and multiprocessing in a Python (2.7, 3.2, 3.3) application?

Answer

You will be safe if you fork off additional processes while you still have only one thread in your program (that is, fork from the main thread, before spawning any worker threads).
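A minimal sketch of that ordering, with placeholder worker names: all child processes are forked first, while the parent is still single-threaded, and only then are threads spawned in the parent to handle the results.

```python
import multiprocessing
import threading


def worker(q):
    # Child-process work; runs in a forked process with no inherited
    # thread state, so no lock can be frozen mid-acquire.
    q.put("computed in child")


def io_thread(q, out):
    # Parent-side thread: collect one result from a child.
    out.append(q.get())


if __name__ == "__main__":
    q = multiprocessing.Queue()
    # 1. Fork all child processes first, while the parent has one thread.
    procs = [multiprocessing.Process(target=worker, args=(q,)) for _ in range(2)]
    for p in procs:
        p.start()
    # 2. Only now spawn worker threads in the parent.
    out = []
    threads = [threading.Thread(target=io_thread, args=(q, out)) for _ in range(2)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    for p in procs:
        p.join()
    print(len(out))  # → 2
```

The point is purely the ordering: a fork copies only the calling thread, so any lock held by another thread at fork time is copied into the child in a locked state with no owner to release it. Forking before any threads exist makes that situation impossible.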

Your use case looks like you don't even need the multiprocessing module; you can use subprocess (or even simpler os.system-like calls).
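Since the children never need to share Python objects with the parent (their results go to disk anyway), a subprocess-based sketch like the following would sidestep fork-inherited locks entirely; the path and the inline computation are placeholders for the real multi-GB job:

```python
import os
import subprocess
import sys
import tempfile

# Run the heavy computation in a fresh interpreter. A fresh process
# inherits no locks from the parent, so logging/threading state in the
# parent cannot deadlock it. The child writes its result to a file.
path = os.path.join(tempfile.mkdtemp(), "result.txt")
child_code = "open({!r}, 'w').write('42')".format(path)
subprocess.check_call([sys.executable, "-c", child_code])

# Parent (or a monitor thread in the parent) reads the result back.
with open(path) as f:
    print(f.read())  # → 42
```

A monitor thread per child still works here, but it would call `subprocess.check_call` (or poll a `Popen` object) instead of joining a forked `multiprocessing.Process`, so the parent's threads and the children's processes share no locks at all.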

See also: Is it safe to fork from within a thread?
