Multiprocessing: Share Unserializable Objects Between Processes


Question

There are three questions that are possible duplicates (but too specific):

Answering this question would answer all three of them. Hopefully I make myself clear:

Once I have created an object in some process created by multiprocessing:

  1. How do I pass a reference to that object to another process?
  2. (Not so important) How do I make sure that this process does not die while I hold a reference?

Example 1 (solved)

from concurrent.futures import ThreadPoolExecutor

def f(v):
    return lambda: v * v

if __name__ == '__main__':
    with ThreadPoolExecutor(1) as e: # works with ThreadPoolExecutor
        l = list(e.map(f, [1,2,3,4]))
    print([g() for g in l]) # [1, 4, 9, 16]
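For contrast, here is a minimal sketch of why the same callables would not survive a trip between processes, assuming the underlying failure is a pickling error (process pools send results back through pickle). The helper name can_pickle is hypothetical and used only for illustration.

```python
import pickle


def f(v):
    # Same function as in Example 1: returns a closure over v.
    return lambda: v * v


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise.

    Hypothetical helper; the exception type raised for local functions
    differs between Python versions, so several are caught.
    """
    try:
        pickle.dumps(obj)
    except (pickle.PicklingError, AttributeError, TypeError):
        return False
    return True


if __name__ == '__main__':
    print(can_pickle(9))     # a plain int can be pickled
    print(can_pickle(f(3)))  # the closure returned by f cannot
```

A thread pool never pickles its results, which is why Example 1 works with ThreadPoolExecutor.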

Example 2

Suppose f returns an object with mutable state. This identical object should be accessible from other processes.

Example 3

I have an object which has an open file and a lock. How do I grant other processes access to it?

Reminder

I am not looking for a way to make this specific error go away, or for a solution to this specific use case. The solution should be general enough to share unmovable objects between processes. The objects can potentially be created in any process. A solution that makes all objects movable while preserving identity would be good, too.

Any hints are welcome. Any partial solution or code fragment that points at how to implement a solution is worth something, so that we can create a solution together.

Here is an attempt to solve this, but without multiprocessing: https://github.com/niccokunzmann/pynet/blob/master/documentation/done/tools.rst

Questions

What do you want the other processes to do with the references?

The references can be passed to any other process created with multiprocessing (duplicate 3). One can access attributes and call the reference. Accessed attributes may or may not be proxies.

What's the problem with just using a proxy?

Maybe there is no problem, only a challenge. My impression was that a proxy has a manager and that a manager has its own process, so the unserializable object must be serialized and transferred (partially solved with StacklessPython/fork). Also, proxies exist for special objects; it is hard but not impossible to build a proxy for all objects (solvable).

Solution? - Proxy + Manager?

Eric Urban showed that serialization is not the problem. The real challenge is in Examples 2 and 3: the synchronization of state. My idea of a solution would be to create a special proxy class for a manager. This proxy class

  1. takes a constructor for unserializable objects
  2. takes a serializable object and transfers it to the manager process.
  3. (problem) according to 1., the unserializable object must be created in the manager process.
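One possible reading of these three points, sketched under the assumption that multiprocessing.managers.BaseManager is an acceptable building block: the constructor is registered on the manager, only serializable arguments travel to the manager process, and the unserializable object (here an open file plus a lock, as in Example 3) is created there. FileLogger, LoggerManager, worker, and demo are hypothetical names chosen for illustration.

```python
import multiprocessing as mp
import os
import tempfile
import threading
from multiprocessing.managers import BaseManager


class FileLogger:
    """Hypothetical unserializable object: owns an open file and a lock."""

    def __init__(self, path):
        self._file = open(path, "a", encoding="utf-8")
        self._lock = threading.Lock()

    def write_line(self, text):
        with self._lock:
            self._file.write(text + "\n")
            self._file.flush()

    def close(self):
        self._file.close()


class LoggerManager(BaseManager):
    pass


# Point 1: the constructor is handed to the manager class.
LoggerManager.register("FileLogger", FileLogger)


def worker(logger, text):
    # `logger` is a proxy; the call runs inside the manager process.
    logger.write_line(text)


def demo():
    fd, path = tempfile.mkstemp(suffix=".log")
    os.close(fd)
    try:
        with LoggerManager() as manager:
            # Points 2 and 3: only `path` is pickled and sent; the file
            # and the lock are created in the manager process.
            logger = manager.FileLogger(path)
            logger.write_line("hello from the parent")
            child = mp.Process(target=worker,
                               args=(logger, "hello from a child"))
            child.start()
            child.join()
            logger.close()
        with open(path, encoding="utf-8") as f:
            return f.read().splitlines()
    finally:
        os.remove(path)


if __name__ == "__main__":
    print(demo())
```

The object itself never moves. Other processes hold proxies, and the manager process stays alive for as long as the with block is open.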

Recommended answer

Most of the time it is not really desirable to pass a reference to an existing object to another process. Instead, you create the class you want to share between processes:

class MySharedClass:
    # stuff...
    pass  # placeholder so that the class body is valid Python

Then you make a proxy manager like this:

import multiprocessing.managers as m
class MyManager(m.BaseManager):
    pass # Pass is really enough. Nothing needs to be done here.

Then you register your class on that manager, like this:

MyManager.register("MySharedClass", MySharedClass)

Then, once the manager is instantiated and started with manager.start(), you can create shared instances of your class with manager.MySharedClass(). This should work for all needs. The returned proxy works exactly like the original object, apart from some exceptions described in the documentation.
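Put together, the steps above might look like the following sketch. The body of MySharedClass (a counter guarded by a lock), the worker function, and run_demo are assumptions made for illustration; only the manager and the register call come from the answer.

```python
import multiprocessing as mp
import threading
from multiprocessing.managers import BaseManager


class MySharedClass:
    """Hypothetical body: a counter with mutable state and a lock."""

    def __init__(self, start=0):
        self._value = start
        self._lock = threading.Lock()  # lives only in the manager process

    def increment(self):
        with self._lock:
            self._value += 1

    def get(self):
        return self._value


class MyManager(BaseManager):
    pass


MyManager.register("MySharedClass", MySharedClass)


def worker(counter, times):
    # `counter` is a proxy; every call is forwarded to the one real object.
    for _ in range(times):
        counter.increment()


def run_demo(n_workers=3, times=100):
    # Using the manager as a context manager starts it and shuts it down.
    with MyManager() as manager:
        counter = manager.MySharedClass(0)
        procs = [mp.Process(target=worker, args=(counter, times))
                 for _ in range(n_workers)]
        for p in procs:
            p.start()
        for p in procs:
            p.join()
        return counter.get()


if __name__ == "__main__":
    print(run_demo())  # 300
```

One of the exceptions to keep in mind: the default proxy forwards only public methods, so the state is read through get() rather than through the _value attribute.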
