AttributeError when reading a pickle file


Problem description

I get the following error when I read my .pkl files in Spyder (Python 3.6.5):

IN: with open(file, "rb") as f:
       data = pickle.load(f)  

Traceback (most recent call last):

 File "<ipython-input-5-d9796b902b88>", line 2, in <module>
   data = pickle.load(f)

AttributeError: Can't get attribute 'Signal' on <module '__main__' from 'C:\\Python36\\lib\\site-packages\\spyder\\utils\\ipython\\start_kernel.py'>

Context:

My program consists of a single file, program.py. It defines a class Signal as well as many functions. A simplified overview of the program is shown below:

import numpy as np
import _pickle as pickle
import os

# The unique class
class Signal:
    def __init__(self, fq, t0, tf):
        self.fq = fq
        self.t0 = t0
        self.tf = tf
        self.timeline = np.round(np.arange(t0, tf, 1/fq*1000), 3)

# The functions
def write_file(data, folder_path, file_name):
    with open(os.path.join(folder_path, file_name), "wb") as output:
        pickle.dump(data, output, -1)  # -1 selects the highest protocol

def read_file(folder_path, file_name):
    with open(os.path.join(folder_path, file_name), "rb") as source:
        data = pickle.load(source)
    return data

def compute_data(*parameters):
    ...  # do stuff

The function compute_data returns a list of tuples of the form:

data = [((Signal_1_1, Signal_1_2, ...), val 1), ((Signal_2_1, Signal_2_2, ...), val 2)...]

Here each Signal_i_k is, of course, a Signal object. This list is saved in .pkl format. Moreover, I run many iterations of compute_data with different parameters. Many iterations use previously computed data as a starting point, and therefore read the corresponding .pkl files.

Finally, I'm using several computers at the same time, each of them saving its computed data on the local network. Thus each computer can access the data generated by the others and use it as a starting point.

Back to the error:

My main issue is that I never get this error when I start my program by double-clicking the file or from the Windows cmd or PowerShell. The program never crashes with this error and runs without apparent issues.

However, I cannot read the .pkl files in Spyder. Every time I try, the error is raised.

Any idea why I get this weird behavior?

Thanks!

Recommended answer

When you dump objects with pickle, you should avoid pickling classes and functions declared in the main module, or instances of such classes. Your problem arises (in part) because your program consists of a single file. pickle is lazy: it does not serialise class definitions or function definitions. Instead it saves a reference describing how to find the class (the module it lives in and its name).

When Python runs a script/file directly, it runs the program as the __main__ module (regardless of its actual file name). However, when a file is loaded and is not the main module (e.g. when you do something like import program), its module name is based on its file name. So program.py becomes the module program.

When you run from the command line you are doing the former, so the module is called __main__ and pickle stores references to your classes such as __main__.Signal. When you load the pickle file in Spyder, pickle is told to import __main__ and look up Signal. But Spyder's __main__ module is the one used to start the Spyder kernel (start_kernel.py in your traceback), not your program.py, so pickle fails to find Signal.
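The mechanism can be reproduced without Spyder. The following is only a minimal sketch under stated assumptions: the module name fake_main and the helper install_module are made up for the demonstration, and a stand-in module plays the role of __main__ in both the "writer" and the "reader".

```python
import pickle
import sys
import types

# Hypothetical class source, standing in for the Signal class of program.py.
CLASS_SOURCE = "class Signal:\n    def __init__(self, fq):\n        self.fq = fq\n"


def install_module(name, source=""):
    """Create a module from source text and register it in sys.modules."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


# "Writer": Signal lives in the module that plays the role of __main__.
writer = install_module("fake_main", CLASS_SOURCE)
payload = pickle.dumps(writer.Signal(1000))

# "Reader": a module with the same name exists, but it does not define Signal.
install_module("fake_main")
try:
    pickle.loads(payload)
    error = None
except AttributeError as exc:
    error = exc

print(type(error).__name__, error)
```

The pickle only says "look up Signal in fake_main", so loading fails as soon as the module registered under that name is a different one, which is what happens with __main__ inside Spyder.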

You can inspect the contents of a pickle file by running the command below (-a prints a description of each opcode). From the output you will see that your class is referenced as __main__.Signal.

python -m pickletools -a file.pkl

You will see something like:

    0: \x80 PROTO      3              Protocol version indicator.
    2: c    GLOBAL     '__main__ Signal' Push a global object (module.attr) on the stack.
   19: q    BINPUT     0                 Store the stack top into the memo.  The stack is not popped.
   21: )    EMPTY_TUPLE                  Push an empty tuple.
   22: \x81 NEWOBJ                       Build an object instance.
   23: q    BINPUT     1                 Store the stack top into the memo.  The stack is not popped.
   ...
   51: b    BUILD                        Finish building an object, via __setstate__ or dict update.
   52: .    STOP                         Stop the unpickling machine.
highest protocol among opcodes = 2
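The same check can be done from Python with pickletools.genops. This is a hedged sketch rather than part of the original answer: the helper global_references and the stand-in module fake_main are hypothetical, and the helper only looks at the GLOBAL opcode, which is what protocols up to 3 emit (protocol 4 and later use STACK_GLOBAL instead, which this sketch does not decode).

```python
import pickle
import pickletools
import sys
import types


def global_references(payload):
    """Return the 'module name' strings stored by GLOBAL opcodes."""
    return [arg for opcode, arg, _pos in pickletools.genops(payload)
            if opcode.name == "GLOBAL"]


# Stand-in for a script's __main__ module (hypothetical name).
stand_in = types.ModuleType("fake_main")
exec("class Signal:\n    pass\n", stand_in.__dict__)
sys.modules["fake_main"] = stand_in

# Protocol 3 is forced so that the class reference is written as GLOBAL.
payload = pickle.dumps(stand_in.Signal(), protocol=3)
references = global_references(payload)
print(references)
```

genops also accepts an open binary file, so the helper could be pointed at an existing .pkl file written with protocol 3 or lower.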

Solutions

There are a number of solutions available to you:

  1. Don't serialise instances of classes that are defined in your __main__ module. This is the easiest and best solution. Instead, move these classes to another module, or write a main.py script that invokes your program (either way, such classes are no longer found in the __main__ module).
  2. Write a custom deserialiser
  3. Write a custom serialiser
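As a rough illustration of the first option, the sketch below writes a small module to a temporary folder, imports it, and pickles an instance. The module name signal_defs and its contents are assumptions made for the demo; in a real project the class would simply live in its own .py file next to program.py.

```python
import importlib
import pickle
import sys
import tempfile
from pathlib import Path

# Hypothetical contents of a separate module that holds the class.
MODULE_SOURCE = (
    "class Signal:\n"
    "    def __init__(self, fq, t0, tf):\n"
    "        self.fq, self.t0, self.tf = fq, t0, tf\n"
)

with tempfile.TemporaryDirectory() as folder:
    Path(folder, "signal_defs.py").write_text(MODULE_SOURCE)
    sys.path.insert(0, folder)
    try:
        signal_defs = importlib.import_module("signal_defs")
    finally:
        sys.path.remove(folder)

# The class now reports its own module rather than __main__,
# so the pickle refers to signal_defs.Signal.
payload = pickle.dumps(signal_defs.Signal(1000, 0, 10))
restored = pickle.loads(payload)
print(type(restored).__module__, restored.fq)
```

Any reader that can import the module under the same name (a script, Spyder, another computer sharing the code) can then load the file without special handling.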

The following solutions work with a pickle file called out.pkl created by the following code (in a file called program.py):

import pickle

class MyClass:
    def __init__(self, name):
        self.name = name

if __name__ == '__main__':
    o = MyClass('test')
    with open('out.pkl', 'wb') as f:
        pickle.dump(o, f)

The Custom Deserialiser Solution

You can write a custom deserialiser that knows that when it encounters a reference to the __main__ module, what you really mean is the program module.

import pickle

class MyCustomUnpickler(pickle.Unpickler):
    def find_class(self, module, name):
        if module == "__main__":
            module = "program"
        return super().find_class(module, name)

with open('out.pkl', 'rb') as f:
    unpickler = MyCustomUnpickler(f)
    obj = unpickler.load()

print(obj)
print(obj.name)

This is the easiest way to load pickle files that have already been created. The problem is that it pushes the responsibility onto the deserialising code, when it should really be the responsibility of the serialising code to create pickle files correctly.
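If several module names need redirecting, the same idea can be parameterised. The following is a sketch under assumptions, not part of the original answer: RenamingUnpickler, install_module and the module names old_home and new_home are invented for the demonstration, with stand-in modules replacing __main__ and program.

```python
import io
import pickle
import sys
import types

SOURCE = "class MyClass:\n    def __init__(self, name):\n        self.name = name\n"


class RenamingUnpickler(pickle.Unpickler):
    """Unpickler that redirects lookups from old module names to new ones."""

    def __init__(self, file, renames):
        super().__init__(file)
        self.renames = renames

    def find_class(self, module, name):
        # Fall back to the stored module name when no rename is registered.
        return super().find_class(self.renames.get(module, module), name)


def install_module(name):
    """Register a stand-in module that defines MyClass (demo helper)."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    sys.modules[name] = module
    return module


# Pickle an instance whose class lives in "old_home" ...
payload = pickle.dumps(install_module("old_home").MyClass("test"))

# ... then make "old_home" unavailable and provide the class in "new_home".
del sys.modules["old_home"]
install_module("new_home")

unpickler = RenamingUnpickler(io.BytesIO(payload), {"old_home": "new_home"})
restored = unpickler.load()
print(type(restored).__module__, restored.name)
```

For the files in the question, the mapping would presumably be {"__main__": "program"}, provided program.py is importable from wherever the files are read.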

The Custom Serialiser Solution

In contrast to the previous solution, you can make sure that serialised pickle objects can be deserialised easily by anyone, without their having to know any custom deserialisation logic. To do this, use the copyreg module to tell pickle how to serialise your classes. Here, you tell pickle to write every instance of a __main__ class as if it were an instance of the corresponding program class. You will need to register a custom serialiser for each class.

import program
import pickle
import copyreg

class MyClass:
    def __init__(self, name):
        self.name = name

def pickle_MyClass(obj):
    assert type(obj) is MyClass
    return program.MyClass, (obj.name,)

copyreg.pickle(MyClass, pickle_MyClass)

if __name__ == '__main__':
    o = MyClass('test')
    with open('out.pkl', 'wb') as f:
        pickle.dump(o, f)
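One way to confirm that a registered serialiser has the intended effect is to look at the reference it writes. The sketch below is illustrative only: the stand-in modules fake_main and fake_program (playing the roles of __main__ and program) and the helper install_module are assumptions, and protocol 3 is forced so the reference appears as a GLOBAL opcode.

```python
import copyreg
import pickle
import pickletools
import sys
import types

SOURCE = "class MyClass:\n    def __init__(self, name):\n        self.name = name\n"


def install_module(name):
    """Register a stand-in module that defines MyClass (demo helper)."""
    module = types.ModuleType(name)
    exec(SOURCE, module.__dict__)
    sys.modules[name] = module
    return module


script_side = install_module("fake_main")     # plays the role of __main__
import_side = install_module("fake_program")  # plays the role of program


def pickle_MyClass(obj):
    # Rebuild the object through the importable class, not the script-side one.
    return import_side.MyClass, (obj.name,)


copyreg.pickle(script_side.MyClass, pickle_MyClass)

payload = pickle.dumps(script_side.MyClass("test"), protocol=3)
references = [arg for opcode, arg, _pos in pickletools.genops(payload)
              if opcode.name == "GLOBAL"]
restored = pickle.loads(payload)
print(references, type(restored).__module__, restored.name)
```

The pickle no longer mentions the script-side module at all, so a plain pickle.load works for any reader that can import the module the reference points to.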
