Python scikit-learn to JSON
Question
I have a model built with Python scikit-learn. I understand that models can be saved in Pickle or Joblib formats. Are there any existing methods to save a model in JSON format? Please see the model build code below for reference:
import pandas
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
import pickle
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
names =['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv(url, names=names)
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]
test_size = 0.33
seed = 7
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(X, Y, test_size=test_size, random_state=seed)
# Fit the model on the training split (67% of the data; 33% is held out for testing)
model = LogisticRegression()
model.fit(X_train, Y_train)
filename = 'finalized_model.sav'
with open(filename, 'wb') as f:
    pickle.dump(model, f)
Recommended answer
You'll have to cook up your own serialization/deserialization recipe. Fortunately, logistic regression can basically be captured by the coefficients and the intercept. However, the LogisticRegression object keeps some other metadata around, which we might as well capture. I threw together the following functions that do the dirty work. Keep in mind, this is still rough:
import json

import numpy as np
from sklearn.linear_model import LogisticRegression


def logistic_regression_to_json(lrmodel, file=None):
    # Write to the open file object if one is given; otherwise return a JSON string.
    if file is not None:
        serialize = lambda x: json.dump(x, file)
    else:
        serialize = json.dumps
    data = {}
    # Constructor arguments, needed to rebuild an equivalent estimator.
    data['init_params'] = lrmodel.get_params()
    # Fitted attributes are NumPy arrays, so convert them to plain lists.
    data['model_params'] = mp = {}
    for p in ('coef_', 'intercept_', 'classes_', 'n_iter_'):
        mp[p] = getattr(lrmodel, p).tolist()
    return serialize(data)


def logistic_regression_from_json(jstring):
    data = json.loads(jstring)
    model = LogisticRegression(**data['init_params'])
    # Restore the fitted attributes as NumPy arrays.
    for name, p in data['model_params'].items():
        setattr(model, name, np.array(p))
    return model
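Note that with just 'coef_', 'intercept_', and 'classes_' you could do the predictions yourself. Logistic regression is a straightforward linear model, so a prediction is simply a matrix multiplication of the inputs with the coefficients, plus the intercept, passed through the logistic function.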
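To illustrate that last point, here is a minimal sketch of predicting straight from the JSON payload without scikit-learn. It assumes a binary problem (so 'coef_' has a single row) and the 'model_params' layout produced by the functions above. The name predict_from_json and the numbers in the example payload are made up for illustration and are not from a fitted model.

```python
import json
import math


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict_from_json(jstring, rows):
    """Return (labels, probabilities of classes_[1]) for a binary model.

    Hypothetical helper: assumes the 'model_params' layout used above.
    """
    params = json.loads(jstring)['model_params']
    weights = params['coef_'][0]    # binary case: coef_ has shape (1, n_features)
    bias = params['intercept_'][0]
    classes = params['classes_']
    labels, probs = [], []
    for row in rows:
        # Linear score: dot product of the features with the coefficients, plus intercept.
        score = sum(w * x for w, x in zip(weights, row)) + bias
        probs.append(sigmoid(score))
        # A positive score means the second class is the more likely one.
        labels.append(classes[1] if score > 0 else classes[0])
    return labels, probs


# Illustrative payload with made-up coefficients.
payload = json.dumps({
    'model_params': {
        'coef_': [[0.5, -0.25]],
        'intercept_': [0.1],
        'classes_': [0.0, 1.0],
    }
})
labels, probs = predict_from_json(payload, [[2.0, 1.0], [0.0, 4.0]])
print(labels)  # [1.0, 0.0]
```

Because this only needs the standard library, the same JSON file could be consumed by a service that does not have scikit-learn installed. A multiclass model would need one score per row of 'coef_' and a different normalization, which this sketch does not cover.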