Multiprocessing Share Unserializable Objects Between Processes

Problem description

Three existing questions are possible duplicates, but each is too specific:

  • How to properly set up multiprocessing proxy objects for objects that already exist
  • Share object with process (multiprocess)
  • Can I use a ProcessPoolExecutor from within a Future?

Answering this question would answer all three of the others. I hope I make myself clear:

Once I have created an object in some process started by multiprocessing:

  1. How do I pass a reference to that object to another process?
  2. (Not so important) How do I make sure that this process does not die while I hold a reference?

Example 1 (solved)

from concurrent.futures import ThreadPoolExecutor

def f(v):
    # Returns a lambda that closes over v; the standard pickle module cannot serialize it.
    return lambda: v * v

if __name__ == '__main__':
    with ThreadPoolExecutor(1) as e:  # works with ThreadPoolExecutor
        l = list(e.map(f, [1, 2, 3, 4]))
    print([g() for g in l])  # [1, 4, 9, 16]
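
Presumably the same code fails with a process pool because results have to be pickled to travel back to the parent process, and the standard pickle module cannot serialize a lambda. The minimal sketch below checks only that pickling step; `can_pickle` is a hypothetical helper introduced for illustration, not part of the question.

```python
import pickle


def f(v):
    # Same function as in Example 1: returns a lambda closing over v.
    return lambda: v * v


def can_pickle(obj):
    """Return True if the standard pickle module can serialize obj."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type for unpicklable objects varies
        # between Python versions, so all three are caught here.
        return False


if __name__ == "__main__":
    print(can_pickle(16))    # a plain int
    print(can_pickle(f(4)))  # the lambda returned by f
```

Threads share one address space, so ThreadPoolExecutor never needs this step, which would explain why Example 1 works there.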

Example 2

Suppose f returns an object with mutable state. This identical object should be accessible from other processes.

Example 3

I have an object which has an open file and a lock - how do I grant access to other processes?

Reminder

I am not asking how to make this specific error go away, nor for a solution to this specific use case. The solution should be general enough to share unmovable objects between processes. The objects can potentially be created in any process. A solution that makes all objects movable and preserves identity would be good, too.

Any hints are welcome; any partial solution or code fragment that points toward an implementation is worth something, so that we can build a solution together.

Here is an attempt to solve this but without multiprocessing: https://github.com/niccokunzmann/pynet/blob/master/documentation/done/tools.rst

Questions

What do you want the other processes to do with the references?

The references can be passed to any other process created with multiprocessing (duplicate 3). One can access attributes and call the reference. Accessed attributes may or may not be proxies.

What's the problem with just using a proxy?

Maybe there is no problem, only a challenge. My impression was that a proxy has a manager and that a manager has its own process, so the unserializable object must be serialized and transferred (partially solved with StacklessPython/fork). Also, there are proxies for special objects; it is hard but not impossible to build a proxy for all objects (solvable).

Solution? - Proxy + Manager?

Eric Urban showed that serialization is not the problem. The real challenge is in Examples 2 and 3: the synchronization of state. My idea of a solution would be to create a special proxy class for a manager. This proxy class

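  1. takes a constructor for unserializable objects
  2. takes a serializable object and transfers it to the manager process.
  3. (problem) according to 1., the unserializable object must be created in the manager process.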
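
The steps above can be sketched with the standard `multiprocessing.managers.BaseManager`, whose `register()` accepts a callable that is run inside the manager process. This is only one possible reading of the idea, not the asker's implementation: `LockedLog`, `LogManager` and `demo` are hypothetical names, and the object with an open file and a lock is modeled on Example 3. Only the path string crosses the process boundary; the file and the lock are created in the manager process and never pickled.

```python
import os
import tempfile
import threading
from multiprocessing.managers import BaseManager


class LockedLog:
    """Hypothetical example object: holds an open file and a lock."""

    def __init__(self, path):
        # Neither attribute can be pickled, so an instance cannot be
        # copied to another process by value.
        self._file = open(path, "a")
        self._lock = threading.Lock()

    def write_line(self, text):
        with self._lock:
            self._file.write(text + "\n")
            self._file.flush()

    def close(self):
        self._file.close()


class LogManager(BaseManager):
    pass


# The registered callable acts as the constructor (step 1) and is
# executed in the manager process (step 3).
LogManager.register("LockedLog", LockedLog)


def demo(path):
    with LogManager() as manager:
        # Step 2: only the serializable path is sent to the manager.
        log = manager.LockedLog(path)
        log.write_line("hello from the parent process")
        log.close()
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    fd, tmp_path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        print(demo(tmp_path))
    finally:
        os.remove(tmp_path)
```

The limitation named in step 3 remains: an object that already exists in some worker process is not covered by this sketch, because the object has to be constructed by the manager.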

Recommended answer

Most of the time it's not really desirable to pass a reference to an existing object to another process. Instead, you write the class whose instances you want to share between processes:

class MySharedClass:
    # stuff...
    pass  # placeholder so the elided class body is valid Python

Then you make a proxy manager like this:

import multiprocessing.managers as m
class MyManager(m.BaseManager):
    pass # Pass is really enough. Nothing needs to be done here.

Then you register your class on that Manager, like this:

MyManager.register("MySharedClass", MySharedClass)

Then, once the manager is instantiated and started with manager.start(), you can create shared instances of your class with manager.MySharedClass(). This should work for most needs. The returned proxy works like the original object, with some exceptions described in the documentation; notably, by default a proxy exposes the object's public methods, not its plain attributes.
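
Put together, the pieces of this answer might look like the sketch below. The names `MySharedClass` and `MyManager` come from the answer; the counter body, `worker` and `run` are hypothetical and only illustrate mutable state shared between processes (Example 2). The lock is an assumption of this sketch: the manager handles each client connection in its own thread, so the update is guarded.

```python
import multiprocessing as mp
import threading
from multiprocessing.managers import BaseManager


class MySharedClass:
    """Hypothetical body: a counter with mutable state and a lock."""

    def __init__(self):
        self._lock = threading.Lock()
        self._count = 0

    def increment(self):
        # Calls from different processes arrive on different server
        # threads in the manager process, so guard the update.
        with self._lock:
            self._count += 1

    def get(self):
        return self._count


class MyManager(BaseManager):
    pass


MyManager.register("MySharedClass", MySharedClass)


def worker(shared, n):
    # `shared` is a proxy; each call is forwarded to the manager process.
    for _ in range(n):
        shared.increment()


def run(workers=4, n=50):
    with MyManager() as manager:  # entering the block starts the manager
        shared = manager.MySharedClass()
        procs = [mp.Process(target=worker, args=(shared, n))
                 for _ in range(workers)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return shared.get()


if __name__ == "__main__":
    print(run())
```

The real object lives only in the manager process, which also addresses the second question above: the object stays alive as long as the manager is running.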
