joblib.load __main__ AttributeError

Question

I'm starting to dive into deploying a predictive model to a web app using Flask, and unfortunately getting stuck at the starting gate.

What I did:

I pickled my model in my model.py program:

import numpy as np
from sklearn.externals import joblib

class NeuralNetwork():
    """
    Two (hidden) layer neural network model. 
    First and second layer contain the same number of hidden units
    """
    def __init__(self, input_dim, units, std=0.0001):
        self.params = {}
        self.input_dim = input_dim

        self.params['W1'] = np.random.rand(self.input_dim, units)
        self.params['W1'] *= std
        self.params['b1'] = np.zeros((units))

        self.params['W2'] = np.random.rand(units, units)
        self.params['W2'] *= std * 10  # Compensate for vanishing gradients
        self.params['b2'] = np.zeros((units))

        self.params['W3'] = np.random.rand(units, 1)
        self.params['b3'] = np.zeros((1,))

model = NeuralNetwork(input_dim=12, units=64)

#####THIS RIGHT HERE ##############
joblib.dump(model, 'demo_model.pkl')

Then I created an api.py file in the same directory as my demo_model.pkl, per this tutorial (https://blog.hyperiondev.com/index.php/2018/02/01/deploy-machine-learning-models-flask-api/):

import flask
from flask import Flask, render_template, request
from sklearn.externals import joblib

app = Flask(__name__)


@app.route("/")
@app.route("/index")
def index():
    return flask.render_template('index.html')


# create endpoint for the predictions (HTTP POST requests)
@app.route('/predict', methods=['POST'])
def make_prediction():
    if request.method == 'POST':
        return render_template('index.html', label='3')


if __name__ == '__main__':
    # LOAD MODEL WHEN APP RUNS ####
    model = joblib.load('demo_model.pkl')
    app.run(host='0.0.0.0', port=8000, debug=True)

I also made a templates/index.html file in the same directory with this info:

<html>
    <head>
        <title>NN Model as Flask API</title>
        <meta charset="utf-8">
        <meta name="viewport" content="width=device-width, initial-scale=1">
    </head>
    <body>
        <h1>Boston Housing Price Predictor</h1>
        <form action="/predict" method="post" enctype="multipart/form-data">
            <input type="file" name="image" value="Upload">
            <input type="submit" value="Predict"> {% if label %} {{ label }} {% endif %}
        </form>
    </body>

</html>

Running:

>> python api.py

gives me an error with the pickler:

Traceback (most recent call last):
  File "api.py", line 22, in <module>
    model = joblib.load('model.pkl')
  File "C:\Users\joshu\Anaconda3\lib\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 578, in load
    obj = _unpickle(fobj, filename, mmap_mode)
  File "C:\Users\joshu\Anaconda3\lib\site-packages\sklearn\externals\joblib\numpy_pickle.py", line 508, in _unpickle
    obj = unpickler.load()
  File "C:\Users\joshu\Anaconda3\lib\pickle.py", line 1043, in load
    dispatch[key[0]](self)
  File "C:\Users\joshu\Anaconda3\lib\pickle.py", line 1342, in load_global
    klass = self.find_class(module, name)
  File "C:\Users\joshu\Anaconda3\lib\pickle.py", line 1396, in find_class
    return getattr(sys.modules[module], name)
AttributeError: module '__main__' has no attribute 'NeuralNetwork'

Why is the main module of the program getting involved with my NeuralNetwork model? I'm very confused at the moment... any advice would be appreciated.

Update:

Adding a class definition class NeuralNetwork(object): pass to my api.py program fixed the bug.

import flask
from flask import Flask, render_template, request
from sklearn.externals import joblib


class NeuralNetwork(object):
    pass


app = Flask(__name__)

If anyone would be willing to offer me an explanation of what was going on that would be hugely appreciated!

Recommended answer

The specific exception you're getting refers to attributes in __main__, but that's mostly a red herring. I'm pretty sure the issue actually has to do with how you dumped the instance.

Pickle does not dump the actual code classes and functions, only their names. It includes the name of the module each one was defined in, so it can find them again. If you dump a class defined in a module you're running as a script, it will dump the name __main__ as the module name, since that's what Python uses as the name for the main module (as seen in the if __name__ == "__main__" boilerplate code).
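As a rough illustration of the "names only" point, here is a minimal sketch that pickles a function from the standard library and inspects the raw bytes. It uses the plain pickle module rather than joblib, and json.dumps is just an arbitrary example of something importable by name.

```python
import json
import pickle

# Pickle a plain function from the standard library.
payload = pickle.dumps(json.dumps)

# The payload is tiny: it records the module name and the function name,
# not the function's code.
print(len(payload), payload)
names_only = b"json" in payload and b"dumps" in payload

# Loading simply looks the name up again in the named module.
same_object = pickle.loads(payload) is json.dumps
```

Because loading is just a lookup by module and name, it can only succeed if that module and name exist in the program doing the loading.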

When you run model.py as a script and pickle an instance of a class defined in it, that class will be saved as __main__.NeuralNetwork rather than model.NeuralNetwork. When you run some other module and try to load the pickle file, Python will look for the class in the __main__ module, since that's where the pickle data tells it to look. This is why you're getting an exception about attributes of __main__.
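The same kind of lookup failure can be reproduced without Flask or joblib. The sketch below is only an illustration under some assumptions: it uses the standard pickle module, and a throwaway module called fake_script (a hypothetical name) stands in for the __main__ of the script that did the dumping.

```python
import pickle
import sys
import types

# Hypothetical stand-in for the script that was run to dump the model.
fake_script = types.ModuleType("fake_script")
sys.modules["fake_script"] = fake_script


class NeuralNetwork:
    def __init__(self):
        self.units = 64


# Make the class look as if it had been defined at the top of that script.
NeuralNetwork.__module__ = "fake_script"
NeuralNetwork.__qualname__ = "NeuralNetwork"
fake_script.NeuralNetwork = NeuralNetwork

payload = pickle.dumps(NeuralNetwork())

# Simulate loading from a program whose module of that name
# does not define the class.
del fake_script.NeuralNetwork

error = None
try:
    pickle.loads(payload)
except AttributeError as exc:
    error = str(exc)

print(error)
```

The pickle data still says "look in fake_script", so the load fails as soon as that module no longer has a NeuralNetwork attribute, which mirrors what happens when api.py becomes __main__.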

To solve this you probably want to change how you're dumping the data. Instead of running model.py as a script, you should probably run some other module and have it do import model, so you get the module under its normal name. (I suppose you could have model.py import itself in an if __name__ == "__main__" block, but that's super ugly and awkward.) You probably also need to avoid recreating and dumping the instance unconditionally when model is imported, since that import needs to happen whenever you load the pickle file (and I assume the whole point of the pickle is to avoid recreating the instance from scratch).

So remove the dumping logic from the bottom of model.py, and add a new file like this:

# new script, dump_model.py, does the creation and dumping of the NeuralNetwork

from sklearn.externals import joblib

from model import NeuralNetwork

if __name__ == "__main__":
    model = NeuralNetwork(input_dim=12, units=64)
    joblib.dump(model, 'demo_model.pkl')

When you dump the NeuralNetwork using this script, it will correctly identify model as the module the class was defined in, and so the loading code will be able to import that module and make an instance of the class correctly.
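To see the round trip work end to end, here is a self-contained sketch. It is not the asker's project: it writes a small module named demo_model_module (a hypothetical name) into a temporary directory, imports it, and uses the standard pickle module in place of joblib so it runs without extra dependencies.

```python
import importlib
import pickle
import sys
import tempfile
import textwrap
from pathlib import Path

# Create a tiny importable module that defines the class.
workdir = Path(tempfile.mkdtemp())
(workdir / "demo_model_module.py").write_text(textwrap.dedent("""
    class NeuralNetwork:
        def __init__(self, input_dim, units):
            self.input_dim = input_dim
            self.units = units
"""))
sys.path.insert(0, str(workdir))
importlib.invalidate_caches()
demo = importlib.import_module("demo_model_module")

# Dump an instance; the class is recorded under the module's real name.
pkl_path = workdir / "demo_model.pkl"
pkl_path.write_bytes(pickle.dumps(demo.NeuralNetwork(input_dim=12, units=64)))

# Any program that can import demo_model_module can now load the file.
restored = pickle.loads(pkl_path.read_bytes())
print(type(restored).__module__, restored.units)
```

The loading side never has to define or even mention the class; it only needs the defining module to be importable.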

Your current "fix" for the issue (defining an empty NeuralNetwork class in the __main__ module when you are loading the object) is probably a bad solution. The instance you get from loading the pickle file will be an instance of the new class, not the original one. It will be loaded with the attributes of the old instance, but it won't have any methods or other class variables set on it (which isn't an issue with the class you've shown, but probably will be for any kind of object that's more complicated).
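A small sketch of why the empty stand-in class is fragile. The predict method and the stub_demo module name are hypothetical, added only for illustration, and the standard pickle module is used instead of joblib.

```python
import pickle
import sys
import types

# Hypothetical module standing in for wherever the class was defined.
stub_demo = types.ModuleType("stub_demo")
sys.modules["stub_demo"] = stub_demo


class NeuralNetwork:
    def __init__(self):
        self.units = 64

    def predict(self):  # hypothetical method, not in the original class
        return "prediction"


NeuralNetwork.__module__ = "stub_demo"
NeuralNetwork.__qualname__ = "NeuralNetwork"
stub_demo.NeuralNetwork = NeuralNetwork

payload = pickle.dumps(NeuralNetwork())


# At load time, an empty class with the same name is all pickle can find.
class EmptyStub:
    pass


stub_demo.NeuralNetwork = EmptyStub

restored = pickle.loads(payload)
has_attributes = restored.units == 64
has_method = hasattr(restored, "predict")
print(type(restored).__name__, has_attributes, has_method)
```

The instance data survives, but the object is now an EmptyStub, so any call to a method of the original class would fail.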
