Share the same multiprocessing.Pool object between different Python instances
Question
In Python 3 I need a Pool of processes to which multiple workers are applied asynchronously.
The problem is that I need to "send" workers to the Pool from a series of separate Python processes, so all the workers should be executed in the same Pool instance.
N.B. The objective is to process a lot of data without using all of the computer's resources.
Given the following example code, multi.py:
import multiprocessing
from time import sleep

def worker(x):
    sleep(5)
    return x * x

if __name__ == "__main__":
    # Use half of the CPU cores
    pool = multiprocessing.Pool(processes=int(multiprocessing.cpu_count() / 2))
    for i in range(10):
        pool.apply_async(worker, args=(i,))
    pool.close()  # no more tasks will be submitted
    pool.join()   # wait for the outstanding tasks, otherwise the script
                  # exits before the asynchronous workers finish
I need to open multiple instances of multi.py and have them append workers to the same pool.
Reading the official documentation, I cannot work out a way to do this. I understand I would need a Manager(), but how should I use it?
Any advice on a Pythonic way to do this, or a working piece of code?
Thank you all.
Answer
In the end I was able to code a working basic example using Python 3's BaseManager. See the docs here.
In a script called server.py:
import multiprocessing
from multiprocessing.managers import BaseManager

jobs = multiprocessing.Manager().Queue()
BaseManager.register('JobsQueue', callable=lambda: jobs)
m = BaseManager(address=('localhost', 55555), authkey=b'myauthkey')
s = m.get_server()
s.serve_forever()
Then, in one or more client.py scripts:
from multiprocessing.managers import BaseManager

BaseManager.register('JobsQueue')  # note the difference from the server: no callable
m = BaseManager(address=('localhost', 55555), authkey=b'myauthkey')  # use the same authkey! It may work remotely too...
m.connect()

# Then you can put data in the queue...
q = m.JobsQueue()
q.put("MY DATA HERE")
# ...or also take it out
data = q.get()
# etc etc...
Obviously this is a basic example, but I think it lets you do a lot of complex work without using external libraries.
A lot of people today reach for a ready-to-use, often heavyweight, library or piece of software without understanding the basics. I'm not one of them...
Cheers.