How to save sklearn model on s3 using joblib.dump?


Question

I have a sklearn model and I want to save the pickle file on my s3 bucket using joblib.dump

I used joblib.dump(model, 'model.pkl') to save the model locally, but I do not know how to save it to s3 bucket.

s3_resource = boto3.resource('s3')
s3_resource.Bucket('my-bucket').Object("model.pkl").put(Body=joblib.dump(model, 'model.pkl'))

I expect the pickled file to be on my s3 bucket.

Answer

Here's a way that worked for me. It's pretty straightforward. I'm using joblib (it's a better choice for storing large sklearn models, since they often contain big NumPy arrays), but you could use pickle too. Note that your original attempt fails because joblib.dump(model, 'model.pkl') returns a list of the filenames it wrote, not the pickled bytes, so that return value is not something you can pass to put(Body=...).
Also, I'm using temporary files for transferring to/from S3. But if you want, you could store the file in a more permanent location.

import tempfile

import boto3
import joblib

# put_object, download_fileobj and delete_object are client methods,
# so use boto3.client rather than boto3.resource here.
s3_client = boto3.client("s3")
bucket_name = "my-bucket"
key = "model.pkl"

# WRITE (`model` is your fitted sklearn estimator)
with tempfile.TemporaryFile() as fp:
    joblib.dump(model, fp)
    fp.seek(0)
    s3_client.put_object(Body=fp.read(), Bucket=bucket_name, Key=key)

# READ
with tempfile.TemporaryFile() as fp:
    s3_client.download_fileobj(Fileobj=fp, Bucket=bucket_name, Key=key)
    fp.seek(0)
    model = joblib.load(fp)

# DELETE
s3_client.delete_object(Bucket=bucket_name, Key=key)
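If you'd rather skip the temporary file entirely, both joblib and pickle can write to an in-memory buffer, and the raw bytes can then be handed to put_object directly. Here's a minimal sketch of that idea using pickle and io.BytesIO; the dict stands in for a fitted sklearn model (any picklable object behaves the same), and the S3 call is shown as a comment so the snippet stays self-contained:

```python
import io
import pickle

# Stand-in for a fitted sklearn model; any picklable object works the same way.
model = {"coef": [0.5, -1.2], "intercept": 0.1}

# Serialize into an in-memory buffer instead of a file on disk.
buf = io.BytesIO()
pickle.dump(model, buf)

# Upload the raw bytes without touching the filesystem, e.g.:
# boto3.client("s3").put_object(Body=buf.getvalue(), Bucket="my-bucket", Key="model.pkl")

# Reading back is the mirror image of the dump above.
restored = pickle.loads(buf.getvalue())
```

The same pattern works with joblib.dump(model, buf) / joblib.load(buf) if you prefer joblib's compression for large models.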
