Python multiprocessing with start method 'spawn' doesn't work


Problem description

I wrote a Python class to create plots in parallel. It works fine on Linux, where the default start method is fork, but when I tried it on Windows I ran into problems (these can be reproduced on Linux using the spawn start method - see the code below). I always end up getting this error:

Traceback (most recent call last):
  File "test.py", line 50, in <module>
    test()
  File "test.py", line 7, in test
    asyncPlotter.saveLinePlotVec3("test")
  File "test.py", line 41, in saveLinePlotVec3
    args=(test, ))
  File "test.py", line 34, in process
    p.start()
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\process.py", line 112, in start
    self._popen = self._Popen(self)
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py", line 223, in _Popen
    return _default_context.get_context().Process._Popen(process_obj)
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\context.py", line 322, in _Popen
    return Popen(process_obj)
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\popen_spawn_win32.py", line 89, in __init__
    reduction.dump(process_obj, to_child)
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\reduction.py", line 60, in dump
    ForkingPickler(file, protocol).dump(obj)
TypeError: can't pickle weakref objects

C:\Python\MonteCarloTools>Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\spawn.py", line 99, in spawn_main
    new_handle = reduction.steal_handle(parent_pid, pipe_handle)
  File "C:\Users\adrian\AppData\Local\Programs\Python\Python37\lib\multiprocessing\reduction.py", line 82, in steal_handle
    _winapi.PROCESS_DUP_HANDLE, False, source_pid)
OSError: [WinError 87] The parameter is incorrect

I hope there is a way to make this code work on Windows. Here is a link to the different start methods available on Linux and Windows: https://docs.python.org/3/library/multiprocessing.html#contexts-and-start-methods

import multiprocessing as mp
import time
def test():

    manager = mp.Manager()
    asyncPlotter = AsyncPlotter(manager.Value('i', 0))

    asyncPlotter.saveLinePlotVec3("test")
    asyncPlotter.saveLinePlotVec3("test")

    asyncPlotter.join()


class AsyncPlotter():

    def __init__(self, nc, processes=mp.cpu_count()):

        self.nc = nc
        self.pids = []
        self.processes = processes


    def linePlotVec3(self, nc, processes, test):

        self.waitOnPool(nc, processes)

        print(test)

        nc.value -= 1


    def waitOnPool(self, nc, processes):

        while nc.value >= processes:
            time.sleep(0.1)
        nc.value += 1


    def process(self, target, args):

        ctx = mp.get_context('spawn') 
        p = ctx.Process(target=target, args=args)
        p.start()
        self.pids.append(p)


    def saveLinePlotVec3(self, test):

        self.process(target=self.linePlotVec3,
                       args=(self.nc, self.processes, test))


    def join(self):
        for p in self.pids:
            p.join()


if __name__=='__main__':
    test()

Answer

When using the spawn start method, the Process object itself is pickled for use in the child process. In your code, the target=target argument is a bound method of AsyncPlotter. For that to work, the entire asyncPlotter instance must also be pickled, and that includes the Manager it holds, which apparently doesn't want to be pickled (hence the weakref error).
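The failure mode can be demonstrated in isolation. In this minimal sketch, `Plotter` is a hypothetical stand-in for AsyncPlotter that stores something unpicklable (here a weakref, similar to what Manager keeps internally); pickling the bound method pulls in the whole instance and fails:

```python
import pickle
import weakref

class Plotter:
    """Hypothetical stand-in for AsyncPlotter: it stores an
    unpicklable attribute (a weakref, like Manager internals)."""
    def __init__(self):
        self._ref = weakref.ref(self)  # weakrefs cannot be pickled

    def plot(self):
        pass

p = Plotter()
try:
    # Pickling the bound method drags the whole instance along,
    # including the unpicklable weakref attribute.
    pickle.dumps(p.plot)
except TypeError as err:
    print("pickling failed:", err)
```

This is exactly what spawn does behind the scenes when it sends the Process object to the child, which is why fork (which copies memory instead of pickling) never hits the problem.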

In short, keep the Manager outside of AsyncPlotter. This works on my macOS system:

def test():
    manager = mp.Manager()
    asyncPlotter = AsyncPlotter(manager.Value('i', 0))
    ...

Also, as noted in your comment, asyncPlotter does not work when reused. I don't know the details, but it looks like it has something to do with how the Value object is shared across processes. The test function would need to look like this:

def test():
    manager = mp.Manager()
    nc = manager.Value('i', 0)

    asyncPlotter1 = AsyncPlotter(nc)
    asyncPlotter1.saveLinePlotVec3("test 1")
    asyncPlotter2 = AsyncPlotter(nc)
    asyncPlotter2.saveLinePlotVec3("test 2")

    asyncPlotter1.join()
    asyncPlotter2.join()
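The key point is that the shared state travels as a plain argument rather than as instance state. As a minimal sketch (not the original class), a module-level worker that receives a manager.Value pickles cleanly under spawn; the Manager lock is an addition here, assumed to make the concurrent increments safe:

```python
import multiprocessing as mp

def worker(nc, lock):
    # A module-level function pickles cleanly under spawn; the
    # Manager proxies simply travel along with the arguments.
    with lock:
        nc.value += 1

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    manager = ctx.Manager()
    nc = manager.Value("i", 0)
    lock = manager.Lock()
    procs = [ctx.Process(target=worker, args=(nc, lock)) for _ in range(4)]
    for p in procs:
        p.start()
    for p in procs:
        p.join()
    print(nc.value)  # all four increments arrive
```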

All in all, you might want to restructure your code and use a process pool. It already handles what AsyncPlotter does with cpu_count and parallel execution:

from multiprocessing import Pool, set_start_method
from random import random
import time

def linePlotVec3(test):
    time.sleep(random())
    print("test", test)

if __name__ == "__main__":
    set_start_method("spawn")
    with Pool() as pool:
        pool.map(linePlotVec3, range(20))

Or you could use a ProcessPoolExecutor to do pretty much the same thing. This example submits tasks one at a time instead of mapping over a list:

from concurrent.futures import ProcessPoolExecutor
import multiprocessing as mp
import time
from random import random

def work(i):
    r = random()
    print("work", i, r)
    time.sleep(r)

def main():
    ctx = mp.get_context("spawn")
    with ProcessPoolExecutor(mp_context=ctx) as pool:
        for i in range(20):
            pool.submit(work, i)

if __name__ == "__main__":
    main()
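If you need return values rather than just side effects, `submit` hands back `Future` objects you can collect. A small sketch along the same lines (the `square` function is just an illustration):

```python
from concurrent.futures import ProcessPoolExecutor, as_completed
import multiprocessing as mp

def square(i):
    return i * i

if __name__ == "__main__":
    ctx = mp.get_context("spawn")
    with ProcessPoolExecutor(mp_context=ctx) as pool:
        futures = [pool.submit(square, i) for i in range(5)]
        # as_completed yields each future as it finishes, in any order
        results = sorted(f.result() for f in as_completed(futures))
    print(results)  # [0, 1, 4, 9, 16]
```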
