Import custom Python module in Azure ML deployment environment


Question

I have an sklearn k-means model. I train the model and save it to a pickle file so that I can deploy it later with the Azure ML library. The model uses a custom feature encoder called MultiColumnLabelEncoder. The pipeline is defined as follows:

from sklearn.cluster import KMeans
from sklearn.pipeline import Pipeline
import joblib

# Pipeline (MultiColumnLabelEncoder is the custom class defined in a separate script)
kmeans = KMeans(n_clusters=3, random_state=0)
pipe = Pipeline([
    ("encoder", MultiColumnLabelEncoder()),
    ("k-means", kmeans),
])
# Train the pipeline
model = pipe.fit(visitors_df)
prediction = model.predict(visitors_df)
# Save the model in pickle/joblib format
filename = 'k_means_model.pkl'
joblib.dump(model, filename)

Saving the model works fine. The deployment steps are the same as the steps in this link:

https://notebooks.azure.com/azureml/projects/azureml-getting-started/html/how-to-use-azureml/deploy-to-cloud/model-register-and-deploy.ipynb

However, the deployment always fails with this error:

  File "/var/azureml-server/create_app.py", line 3, in <module>
    from app import main
  File "/var/azureml-server/app.py", line 27, in <module>
    import main as user_main
  File "/var/azureml-app/main.py", line 19, in <module>
    driver_module_spec.loader.exec_module(driver_module)
  File "/structure/azureml-app/score.py", line 22, in <module>
    importlib.import_module("multilabelencoder")
  File "/azureml-envs/azureml_b707e8c15a41fd316cf6c660941cf3d5/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
ModuleNotFoundError: No module named 'multilabelencoder'

I understand that pickle/joblib has trouble unpickling the custom class MultiColumnLabelEncoder. That is why I defined this class in a separate Python script (which I also executed). I import this class in the training script, in the deployment script, and in the scoring file (score.py). The import in score.py fails. So my question is: how can I import a custom Python module into the Azure ML deployment environment?

Thanks in advance.

This is my .yml file:

name: project_environment
dependencies:
  # The python interpreter version.
  # Currently Azure ML only supports 3.5.2 and later.
- python=3.6.2

- pip:
  - multilabelencoder==1.0.4
  - scikit-learn
  - azureml-defaults==1.0.74.*
  - pandas
channels:
- conda-forge

Recommended answer

In fact, the solution was to install my custom class MultiColumnLabelEncoder as a pip package (available via pip install multilabelencoder==1.0.5). I then listed the pip package in the .yml file, or in the InferenceConfig of the Azure ML environment. In the score.py file, I imported the class as follows:

import os

import joblib
from multilabelencoder import multilabelencoder

def init():
    global model

    # Instantiate the custom encoder so the class is available when unpickling the model
    encoder = multilabelencoder.MultiColumnLabelEncoder()
    # Get the path where the deployed model can be found.
    model_path = os.path.join(os.getenv('AZUREML_MODEL_DIR'), 'k_means_model_45.pkl')
    model = joblib.load(model_path)
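
A missing package only shows up as a ModuleNotFoundError once the scoring script is imported. One possible way to get a clearer message is to check, before loading the model, that the required top-level modules can be found in the environment. The following is a minimal sketch using only the standard library; the helper name missing_modules and the list of names are assumptions for illustration and are not part of the Azure ML API.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found on the import path."""
    # find_spec returns None for a top-level module that is not installed.
    return [name for name in names if importlib.util.find_spec(name) is None]


# Hypothetical list of modules the pickled pipeline depends on.
required = ["json", "multilabelencoder"]
not_found = missing_modules(required)
if not_found:
    print("Missing from this environment:", ", ".join(not_found))
else:
    print("All required modules are importable.")
```

Calling such a check at the top of init() would report the missing package by name instead of failing deep inside joblib.load.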

Then the deployment succeeded. One more important point: I had to use the same pip package (multilabelencoder) in the training pipeline, as shown here:

from multilabelencoder import multilabelencoder 
pipe = Pipeline([
    ("encoder", multilabelencoder.MultiColumnLabelEncoder(columns)),
    ('k-means', kmeans),
])
#Training the pipeline
trainedModel = pipe.fit(df)
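
The reason the same package has to be used for training and scoring is that pickle stores a reference to the class (its module path and name), not the class source code. The sketch below illustrates this with the standard library only. The module name fake_encoder and the class Encoder are hypothetical stand-ins for a custom package; they are created at runtime so the example is self-contained.

```python
import pickle
import sys
import types

# Build a throwaway module at runtime to stand in for a custom package.
# "fake_encoder" and "Encoder" are hypothetical names used only in this demo.
module = types.ModuleType("fake_encoder")
exec(
    "class Encoder:\n"
    "    def __init__(self):\n"
    "        self.columns = ['a', 'b']\n",
    module.__dict__,
)
sys.modules["fake_encoder"] = module

# Pickle an instance while the module is importable ("training" side).
payload = pickle.dumps(module.Encoder())

# The payload contains the module name, not the class definition.
assert b"fake_encoder" in payload

# Loading works as long as the module can be imported.
restored = pickle.loads(payload)

# Simulate the deployment container, where the module was never installed.
del sys.modules["fake_encoder"]
try:
    pickle.loads(payload)
    error = None
except ModuleNotFoundError as exc:
    error = str(exc)

print(error)  # No module named 'fake_encoder'
```

If the class had been pickled from a different module path during training (for example a local script), the scoring environment would need that exact path to be importable, which is why installing one pip package on both sides resolves the error.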
