Script using multiprocessing module does not terminate
Question
The following code does not print "here". What is the problem?
I tested it on both of my machines (Windows 7, Ubuntu 12.10) and on
http://www.compileonline.com/execute_python_online.php
and it does not print "here" in any case.
from multiprocessing import Queue, Process

def runLang(que):
    print "start"
    myDict=dict()
    for i in xrange(10000):
        myDict[i]=i
    que.put(myDict)
    print "finish"

def run(fileToAnalyze):
    que=Queue()
    processList=[]
    dicList=[]
    langs= ["chi","eng"]
    for lang in langs:
        p=Process(target=runLang,args=(que,))
        processList.append(p)
        p.start()

    for p1 in processList:
        p1.join()

    print "here"

    for _ in xrange(len(langs)):
        item=que.get()
        print item
        dicList.append(item)

if __name__=="__main__":
    processList = []
    for fileToAnalyse in ["abc.txt","def.txt"]:
        p=Process(target=run,args=(fileToAnalyse,))
        processList.append(p)
        p.start()

    for p1 in processList:
        p1.join()
Answer
This is because when you put lots of items into a multiprocessing.Queue, they eventually get buffered in memory once the underlying Pipe is full. The buffer won't get flushed until something starts reading from the other end of the Queue, which allows the Pipe to accept more data. A Process cannot terminate until the buffers for all its Queue instances have been entirely flushed to their underlying Pipe. The implication of this is that if you try to join a process without another process/thread calling get on its Queue, you could deadlock. This is mentioned in the docs:
Warning
As mentioned above, if a child process has put items on a queue (and it has not used JoinableQueue.cancel_join_thread), then that process will not terminate until all buffered items have been flushed to the pipe.
This means that if you try joining that process you may get a deadlock unless you are sure that all items which have been put on the queue have been consumed. Similarly, if the child process is non-daemonic then the parent process may hang on exit when it tries to join all its non-daemonic children.
Note that a queue created using a manager does not have this issue.
You can fix the issue by not calling join until after you empty the Queue in the parent:
    for _ in xrange(len(langs)):
        item = que.get()
        print(item)
        dicList.append(item)

    # join after emptying the queue.
    for p in processList:
        p.join()

    print("here")