Save classifier to disk in scikit-learn


Question

How do I save a trained Naive Bayes classifier to disk and use it to predict data?

I have the following sample program from the scikit-learn website:

from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
gnb = GaussianNB()
# fit on the iris data, then predict on the same data
y_pred = gnb.fit(iris.data, iris.target).predict(iris.data)
print("Number of mislabeled points : %d" % (iris.target != y_pred).sum())

Recommended answer

Classifiers are just objects that can be pickled and dumped like any other. To continue your example:

import pickle  # on Python 2 this module was available as cPickle

# save the classifier
with open('my_dumped_classifier.pkl', 'wb') as fid:
    pickle.dump(gnb, fid)

# load it again
with open('my_dumped_classifier.pkl', 'rb') as fid:
    gnb_loaded = pickle.load(fid)
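The question also asks how to use the saved classifier for prediction. The following is a minimal sketch of the full round trip, assuming Python 3 (where the standard-library module is `pickle`) and an installed scikit-learn. The temporary directory and the `predictions_match` variable are illustrative choices, not part of the original answer.

```python
import os
import pickle
import tempfile

from sklearn import datasets
from sklearn.naive_bayes import GaussianNB

iris = datasets.load_iris()
gnb = GaussianNB().fit(iris.data, iris.target)

# Assumption: a temporary directory is used so the example cleans up after itself.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "my_dumped_classifier.pkl")

    # save the fitted classifier
    with open(path, "wb") as fid:
        pickle.dump(gnb, fid)

    # load it again
    with open(path, "rb") as fid:
        gnb_loaded = pickle.load(fid)

# the reloaded object is used exactly like the original one
y_pred_loaded = gnb_loaded.predict(iris.data)
predictions_match = bool((y_pred_loaded == gnb.predict(iris.data)).all())
print("Predictions match after reload:", predictions_match)
```

The reloaded object exposes the same `predict` method as the original, so no refitting is needed. Pickle files should only be loaded from sources you trust, and are generally expected to be read back with the same scikit-learn version that wrote them.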

If you are using a scikit-learn Pipeline that contains custom transformers which cannot be serialized by pickle (or by joblib), Neuraxle's custom ML pipeline saving is one solution: it lets you define your own step savers on a per-step basis. When saving, the saver is called for each step that defines one; steps without a saver fall back to joblib by default.
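To illustrate what "cannot be serialized by pickle" can look like in practice, here is a small sketch using only the standard library. One common cause, assumed here for illustration, is a custom step that stores a lambda. `FunctionStep`, `double`, and `can_pickle` are hypothetical names made up for this example; they are not scikit-learn or Neuraxle APIs.

```python
import pickle


def double(x):
    """Module-level function: pickle stores it by reference to its name."""
    return 2 * x


class FunctionStep:
    """Hypothetical stand-in for a custom transformer that wraps a function."""

    def __init__(self, func):
        self.func = func

    def transform(self, values):
        return [self.func(v) for v in values]


def can_pickle(obj):
    """Return True if pickle.dumps succeeds on obj, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        return False


lambda_step = FunctionStep(lambda x: 2 * x)  # a lambda has no importable name
named_step = FunctionStep(double)            # a named function does

print("lambda step picklable:", can_pickle(lambda_step))
print("named step picklable:", can_pickle(named_step))
```

When the offending attribute can be replaced by a named, importable function, plain pickle may be enough. When it cannot, a per-step saving mechanism such as the one described above is the kind of workaround to look at.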
