multiprocessing queue issue with pickle dumps

Question

I have read and re-read the Python documentation on the multiprocessing module and on Queue management, but I cannot find anything related to this issue. It is driving me crazy and blocking my project:

I wrote a 'JsonLike' class that lets me create an object such as:

a = JsonLike()
a.john.doe.is_here = True

...without having to initialize the intermediate levels (very useful).

The following code just creates such an object, sets a value on it, inserts it into a list and tries to send that list to a process (this is what I need, but sending the object itself leads to the same error).

Consider this code:

from multiprocessing import Process, Queue, Event

class JsonLike(dict):
    """
    This class allows json-crossing-through creation and setting such as :
    a = JsonLike()
    a.john.doe.is_here = True
    it automatically creates all the hierarchy
    """

    def __init__(self, *args, **kwargs):
        # super(JsonLike, self).__init__(*args, **kwargs)
        dict.__init__(self, *args, **kwargs)
        for arg in args:
            if isinstance(arg, dict):
                for k, v in arg.items():
                    self[k] = v
        if kwargs:
            for k, v in kwargs.items():
                self[k] = v

    def __getattr__(self, attr):
        if self.get(attr) != None:
            return attr
        else:
            newj = JsonLike()
            self.__setattr__(attr, newj)
            return newj

    def __setattr__(self, key, value):
        self.__setitem__(key, value)

    def __setitem__(self, key, value):
        dict.__setitem__(self, key, value)
        self.__dict__.update({key: value})

    def __delattr__(self, item):
        self.__delitem__(item)

    def __delitem__(self, key):
        dict.__delitem__(self, key)
        del self.__dict__[key]


def readq(q, e):
    while True:
        obj = q.get()
        print('got')
        if e.is_set():
            break


if __name__ == '__main__':
    q = Queue()
    e = Event()

    obj = JsonLike()
    obj.toto = 1

    arr=[obj]

    proc = Process(target=readq, args=(q,e))
    proc.start()
    print(f"Before sending value :{arr}")
    q.put(arr)
    print('sending done')
    e.set()
    proc.join()
    proc.close()

I get the following output (on the q.put):

Before sending value :[{'toto': 1}]
Traceback (most recent call last):
sending done
  File "/usr/lib/python3.7/multiprocessing/queues.py", line 236, in _feed
    obj = _ForkingPickler.dumps(obj)
  File "/usr/lib/python3.7/multiprocessing/reduction.py", line 51, in dumps
    cls(buf, protocol).dump(obj)
TypeError: 'JsonLike' object is not callable

Any suggestions?

Recommended answer

The problem is that you are messing with __getattr__. If you add a print statement inside this method, you will see that running the following code crashes too:

obj = JsonLike()
obj.toto.test = 1

q = Queue()
q.put(obj)
q.get()

Sending the object through the queue results in repeated calls to obj.__getattr__, which is asked for an attribute named __getstate__ (pickle will later look for its counterpart, __setstate__). Here is what the pickle documentation says about this dunder method:

If the __getstate__() method is absent, the instance's __dict__ is pickled as usual.
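As a small illustration of that rule, the sketch below pickles one object that has no __getstate__ and one that defines it. The class names Plain and Trimmed and their attributes are made up for this example; they are not part of the question's code.

```python
import pickle


class Plain:
    """No __getstate__: pickle falls back to the instance __dict__."""

    def __init__(self):
        self.name = "toto"
        self.cache = [1, 2, 3]


class Trimmed(Plain):
    """Defines __getstate__ to control what ends up in the pickle."""

    def __getstate__(self):
        state = self.__dict__.copy()
        del state["cache"]  # leave the cache out of the pickled state
        return state


plain_copy = pickle.loads(pickle.dumps(Plain()))
trimmed_copy = pickle.loads(pickle.dumps(Trimmed()))

print(plain_copy.__dict__)
print(trimmed_copy.__dict__)
```

The first copy gets its whole __dict__ back, while the second only gets what __getstate__ returned. In both cases pickle has to decide whether the hook exists, and that lookup is what a catch-all __getattr__ interferes with.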

In your case, this method doesn't exist, but your code makes it look like it does (by creating an attribute with the right name on the fly). The default behavior is therefore not triggered; instead, the freshly created attribute named __getstate__ is called. Since that attribute is an empty JsonLike object rather than a callable, the call fails. This is why you see the error "'JsonLike' object is not callable".
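To make the mechanism concrete, here is a minimal sketch of what happens when library code probes an object for an optional hook. AutoAttr is a hypothetical stand-in for JsonLike, reduced to the catch-all __getattr__. The getattr(..., None) probe mirrors the general pattern pickle uses for hooks such as __setstate__, though exactly which hooks are looked up on the instance may vary between Python versions.

```python
class AutoAttr(dict):
    """Hypothetical stand-in for JsonLike: every missing attribute is created on demand."""

    def __getattr__(self, name):
        child = AutoAttr()
        self[name] = child
        return child


probe = AutoAttr()

# Library code often asks: "does this object define the optional hook?"
hook = getattr(probe, "__setstate__", None)

# The catch-all __getattr__ answers "yes" by returning an empty AutoAttr ...
print(type(hook).__name__, callable(hook))

# ... so the caller goes on to call the hook, and that call fails.
try:
    hook({})
    error_message = ""
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

The probe also leaves a stray "__setstate__" key behind in the dictionary, which is a second reason to keep such names away from the auto-creation logic.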

One quick fix is to avoid touching attributes that look like __xx__, or even _xx. To do so, you can add/modify these lines:

class JsonLike(dict):

    def __getattr__(self, attr):
        # Names starting with an underscore (_xx as well as __xx__) are never
        # created on the fly: raising AttributeError lets pickle see that
        # hooks such as __getstate__ are absent and use its default behavior.
        if attr.startswith("_"):
            raise AttributeError(attr)
        if self.get(attr) is not None:
            # Return the stored value, not the attribute name.
            return self[attr]
        else:
            newj = JsonLike()
            self.__setattr__(attr, newj)
            return newj

This makes the previous code work (the same goes for your code). On the other hand, names starting with an underscore are no longer created on the fly, so you won't be able to write things like obj.__toto__.test = 1 anymore, but that's probably a good thing anyway.
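A quick way to check a fix like this without starting a child process is a plain pickle round trip, since multiprocessing.Queue pickles whatever is put on it. The sketch below uses a simplified JsonLike written for this example: unlike the class in the question, it keeps values only in the dict and does not mirror them into __dict__, so treat it as an assumption rather than the original design.

```python
import pickle


class JsonLike(dict):
    """Simplified sketch: values live only in the dict, attribute access is sugar."""

    def __getattr__(self, attr):
        # Private and dunder lookups must fail normally so that pickle
        # falls back to its default behavior.
        if attr.startswith("_"):
            raise AttributeError(attr)
        if attr not in self:
            self[attr] = JsonLike()
        return self[attr]

    def __setattr__(self, key, value):
        self[key] = value


obj = JsonLike()
obj.toto.test = 1

# Same kind of payload as in the question: a list holding the object.
restored = pickle.loads(pickle.dumps([obj]))

print(restored)
print(type(restored[0]).__name__, restored[0].toto.test)
```

If the round trip succeeds and the nested values come back as JsonLike instances, putting the same list on a Queue should no longer fail while pickling.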

I suspect you will run into similar bugs in other contexts and, sadly, some libraries probe for attribute names that are not so predictable. That's one of the reasons why I wouldn't suggest using such a mechanism in real life (even though I really like the idea and would love to see how far it can go).
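Two such contexts can be sketched with the standard library alone. Naive and Guarded are hypothetical reductions of the two JsonLike variants discussed above. The first part shows copy.deepcopy, which probes the instance for __deepcopy__; the second shows a duck-typing check on a name without underscores, which the guard cannot catch.

```python
import copy


class Naive(dict):
    """Hypothetical catch-all version: creates any missing attribute."""

    def __getattr__(self, attr):
        self[attr] = Naive()
        return self[attr]


class Guarded(dict):
    """Hypothetical version that refuses names starting with an underscore."""

    def __getattr__(self, attr):
        if attr.startswith("_"):
            raise AttributeError(attr)
        if attr not in self:
            self[attr] = Guarded()
        return self[attr]


# deepcopy looks for optional hooks on the instance; the catch-all version
# hands back a non-callable object, and calling it raises TypeError.
try:
    copy.deepcopy(Naive())
    deepcopy_error = None
except TypeError as exc:
    deepcopy_error = exc

# With the underscore guard, deepcopy falls back to its default path.
guarded = Guarded(toto=1)
clone = copy.deepcopy(guarded)

# A duck-typing probe on an ordinary name still gets a "yes",
# and silently adds a "read" key to the object.
looks_like_file = hasattr(guarded, "read")

print(repr(deepcopy_error))
print(clone, looks_like_file, guarded)
```

The underscore guard covers protocol hooks, but any library that decides what to do based on hasattr(obj, "some_name") will still be misled.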
