Store object using Python pickle, and load it into different namespace

Question

I'd like to pass object state between two Python programs (one is my own code running standalone, one is a Pyramid view), and different namespaces. Somewhat related questions are here or here, but I can't quite follow through with them for my scenario.

My own code defines a global class (i.e. __main__ namespace) of somewhat complexish structure:

# An instance of this is a colorful mess of nested lists and sets and dicts.
class MyClass:
    def __init__(self):
        # Store the containers on the instance so they are part of its pickled state.
        self.data = set()
        self.more = dict()
        ...

    def do_sth(self):
        ...

At some point I pickle an instance of this class:

import pickle

c = MyClass()
# Fill c with data.

# Pickle and write the MyClass instance within the __main__ namespace.
# Protocol -1 selects the highest protocol available.
with open("my_c.pik", "wb") as f:
    pickle.dump(c, f, -1)

A hexdump -C my_c.pik shows that the first couple of bytes contain __main__.MyClass from which I assume that the class is indeed defined in the global namespace, and that this is somehow a requirement for reading the pickle. Now I'd like to load this pickled MyClass instance from within a Pyramid view, which I assume is a different namespace:

# In Pyramid (different namespace) read the pickled MyClass instance.
with open("my_c.pik", "rb") as f :
    c = pickle.load(f)

But this results in the following error:

File ".../views.py", line 60, in view_handler_bla
  c = pickle.load(f)
AttributeError: 'module' object has no attribute 'MyClass'
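
The traceback fits how pickle records a class instance: the stream holds the instance state plus the module and class name, and the loader looks that name up again at load time. A minimal sketch of this, assuming Python 3; the module name producer_main is a hypothetical stand-in for the standalone program's __main__:

```python
import pickle
import sys
import types

# Hypothetical stand-in for the standalone program that defines MyClass.
producer = types.ModuleType("producer_main")
source = """
class MyClass:
    def __init__(self):
        self.data = set()
        self.more = dict()

    def do_sth(self):
        return len(self.data)
"""
exec(source, producer.__dict__)
sys.modules["producer_main"] = producer

c = producer.MyClass()
c.data.add(1)
blob = pickle.dumps(c, protocol=2)

# The stream names the class by module and class name ...
has_class_ref = b"producer_main" in blob and b"MyClass" in blob
# ... but carries nothing of the method defined in the class body.
has_method_code = b"do_sth" in blob

# Simulate the loading side: the module exists, but it has no MyClass.
del producer.MyClass
try:
    pickle.loads(blob)
    load_error = None
except AttributeError as exc:
    load_error = exc

print(has_class_ref, has_method_code, type(load_error).__name__)
```

Under Pyramid the same lookup goes to whatever module is __main__ in that process, which is not the standalone script, so the name MyClass is not found there.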

It seems to me that the MyClass definition is missing in whatever namespace the view code executes? I had hoped (assumed) that pickling is a somewhat opaque process which allows me to read a blob of data into whichever place I chose. (More on Python's class names and namespaces is here.)

How can I handle this properly? (Ideally without having to import stuff across...) Can I somehow find the current namespace and inject MyClass (like this answer seems to suggest)?
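
One way to "inject" the class at load time, sketched here under assumptions rather than taken from the answer, is to subclass pickle.Unpickler and override find_class so that the recorded module and class name are mapped to a class the loading program does have. The sketch assumes Python 3; producer_main and RenamingUnpickler are hypothetical names, and in the scenario above the recorded pair would be ("__main__", "MyClass"):

```python
import io
import pickle
import sys
import types

# Hypothetical stand-in for the program that wrote the pickle.
producer = types.ModuleType("producer_main")
exec(
    "class MyClass:\n"
    "    def __init__(self):\n"
    "        self.data = set()\n"
    "        self.more = dict()\n",
    producer.__dict__,
)
sys.modules["producer_main"] = producer
c = producer.MyClass()
c.data.update({1, 2, 3})
blob = pickle.dumps(c)
del sys.modules["producer_main"]  # the loading side has no such module


class MyClass:
    """The loading program's own definition of the class."""

    def __init__(self):
        self.data = set()
        self.more = dict()

    def do_sth(self):
        return sorted(self.data)


class RenamingUnpickler(pickle.Unpickler):
    """Redirect the recorded class reference to the local MyClass."""

    def find_class(self, module, name):
        if (module, name) == ("producer_main", "MyClass"):
            return MyClass
        return super().find_class(module, name)


restored = RenamingUnpickler(io.BytesIO(blob)).load()
print(type(restored).__name__, restored.do_sth())
```

This still requires a MyClass definition on the loading side; it only removes the requirement that the definition live under the same module name as when the pickle was written.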

A poor solution

It seems to me that if I refrain from defining and using MyClass and instead fall back to plain built-in datatypes, this wouldn't be a problem. In fact, I could "serialize" the MyClass object into a sequence of calls that pickle the individual elements of the MyClass instance:

# 'Manual' serialization of c works, because all elements are built-in types.
pickle.dump(c.data, f, -1)
pickle.dump(c.more, f, -1)
...

This would defeat the purpose of wrapping data into classes though.
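
A possible middle ground, offered as a sketch and not as part of the original question, is to keep the class but pickle only a plain dict of its attributes, then rebuild the instance on the loading side. The helper names to_state and from_state are hypothetical, and the sketch assumes every attribute value is a built-in type:

```python
import pickle


class MyClass:
    def __init__(self):
        self.data = set()
        self.more = dict()

    def to_state(self):
        # Shallow copy of the instance attributes as a plain dict.
        return dict(vars(self))

    @classmethod
    def from_state(cls, state):
        # Rebuild an instance without calling __init__, then restore attributes.
        obj = cls.__new__(cls)
        obj.__dict__.update(state)
        return obj


c = MyClass()
c.data.add(1)
c.more["k"] = [1, 2]

blob = pickle.dumps(c.to_state())
# The stream now holds only built-in types, so no class lookup happens on load.
refers_to_class = b"MyClass" in blob

restored = MyClass.from_state(pickle.loads(blob))
print(refers_to_class, restored.data, restored.more)
```

The loading program can pass the dict to its own class, whatever module that class lives in, because the pickle no longer records a class reference.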

Note

Pickling stores only the state of an instance, not the functions defined in the class body (e.g. do_sth() in the above example). Loading a MyClass instance in a namespace that lacks the proper class definition therefore restores at most the instance data; calling a missing method such as do_sth() raises an AttributeError.
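
The effect can be reproduced in a small sketch, assuming Python 3: the instance is pickled with the full class, then loaded while a stub class of the same name is in place. The module name shared_models is a hypothetical stand-in:

```python
import pickle
import sys
import types

# Hypothetical module that first holds the full class, later only a stub.
shared = types.ModuleType("shared_models")
sys.modules["shared_models"] = shared
exec(
    "class MyClass:\n"
    "    def __init__(self):\n"
    "        self.data = set()\n"
    "    def do_sth(self):\n"
    "        return len(self.data)\n",
    shared.__dict__,
)

c = shared.MyClass()
c.data.add(42)
blob = pickle.dumps(c)

# Replace the full definition with a stub that has no methods.
exec("class MyClass:\n    pass\n", shared.__dict__)

restored = pickle.loads(blob)
state_survived = restored.data == {42}

try:
    restored.do_sth()
    method_missing = False
except AttributeError:
    method_missing = True

print(state_survived, method_missing)
```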

Answer

Use dill instead of pickle, because dill by default pickles by serializing the class definition and not by reference.

>>> import dill
>>> class MyClass:
...   def __init__(self): 
...     self.data = set()
...     self.more = dict()
...   def do_stuff(self):
...     return sorted(self.more)
... 
>>> c = MyClass()
>>> c.data.add(1)
>>> c.data.add(2)
>>> c.data.add(3)
>>> c.data
set([1, 2, 3])
>>> c.more['1'] = 1
>>> c.more['2'] = 2
>>> c.more['3'] = lambda x:x
>>> def more_stuff(self, x):  
...   return x+1
... 
>>> MyClass.more_stuff = more_stuff
>>> 
>>> with open('my_c.pik', "wb") as f:
...   dill.dump(c, f)
... 
>>> 

Shut down the session, and restart in a new session…

Python 2.7.8 (default, Jul 13 2014, 02:29:54) 
[GCC 4.2.1 Compatible Apple Clang 4.1 ((tags/Apple/clang-421.11.66))] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import dill
>>> with open('my_c.pik', "rb") as f:
...   c = dill.load(f)
... 
>>> c.data
set([1, 2, 3])
>>> c.more
{'1': 1, '3': <function <lambda> at 0x10473ec80>, '2': 2}
>>> c.do_stuff()
['1', '2', '3']
>>> c.more_stuff(5)
6

Get dill here: https://github.com/uqfoundation/dill
