Python scikit-learn to JSON


Question

I have a model built with Python scikit-learn. I understand that models can be saved in Pickle or Joblib format. Are there any existing methods for saving a model in JSON format? Please see the model-building code below for reference:

import pandas
from sklearn import model_selection
from sklearn.linear_model import LogisticRegression
import pickle
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/pima-indians-diabetes/pima-indians-diabetes.data"
names =['preg', 'plas', 'pres', 'skin', 'test', 'mass', 'pedi', 'age', 'class']
dataframe = pandas.read_csv(url, names=names)
array = dataframe.values
X = array[:,0:8]
Y = array[:,8]
test_size = 0.33
seed = 7
X_train, X_test, Y_train, Y_test = model_selection.train_test_split(X, Y, test_size=test_size, random_state=seed)
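# Fit the model on the training split (67% of the data; 33% is held out for testing)
model = LogisticRegression()
model.fit(X_train, Y_train)
# Save the fitted model with pickle, closing the file when done
filename = 'finalized_model.sav'
with open(filename, 'wb') as f:
    pickle.dump(model, f)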

Recommended answer

You'll have to cook up your own serialization/deserialization recipe. Fortunately, a logistic regression can basically be captured by its coefficients and intercept. However, the LogisticRegression object keeps some other metadata around, which we might as well capture too. I threw together the following functions that do the dirty work. Keep in mind, this is still rough:

import numpy as np
import json
from sklearn.linear_model import LogisticRegression

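def logistic_regression_to_json(lrmodel, file=None):
    """Serialize a fitted LogisticRegression to JSON.

    Writes to `file` (returning None) if given, otherwise returns a JSON string.
    """
    data = {
        # Constructor arguments, needed to rebuild an equivalent estimator
        'init_params': lrmodel.get_params(),
        # Fitted attributes, converted from NumPy arrays to plain lists
        'model_params': {
            p: getattr(lrmodel, p).tolist()
            for p in ('coef_', 'intercept_', 'classes_', 'n_iter_')
        },
    }
    if file is not None:
        return json.dump(data, file)
    return json.dumps(data)

def logistic_regression_from_json(jstring):
    """Rebuild a LogisticRegression from a JSON string produced above."""
    data = json.loads(jstring)
    model = LogisticRegression(**data['init_params'])
    # Restore the fitted attributes as NumPy arrays
    for name, p in data['model_params'].items():
        setattr(model, name, np.array(p))
    return model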
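
As a quick sanity check, here is a minimal round-trip sketch. It is not part of the original recipe: it assumes synthetic data from make_classification instead of the diabetes CSV, and the helper names lr_to_json and lr_from_json are hypothetical, compact stand-ins for the two functions above. It only checks that a restored model predicts the same as the original on the same inputs.

```python
import json

import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Fitted attributes to carry over (same list as in the recipe above)
FITTED_ATTRS = ("coef_", "intercept_", "classes_", "n_iter_")


def lr_to_json(model):
    """Hypothetical helper: dump constructor params and fitted arrays to a JSON string."""
    return json.dumps({
        "init_params": model.get_params(),
        "model_params": {a: getattr(model, a).tolist() for a in FITTED_ATTRS},
    })


def lr_from_json(jstring):
    """Hypothetical helper: rebuild the estimator from the JSON string."""
    data = json.loads(jstring)
    model = LogisticRegression(**data["init_params"])
    for name, values in data["model_params"].items():
        setattr(model, name, np.array(values))
    return model


# Assumption: synthetic binary classification data, 8 features like the question's dataset
X, y = make_classification(n_samples=200, n_features=8, random_state=7)
original = LogisticRegression(max_iter=1000).fit(X, y)

# Serialize to JSON and restore
restored = lr_from_json(lr_to_json(original))

# Compare predictions from the original and the restored model
same_labels = bool(np.array_equal(original.predict(X), restored.predict(X)))
same_probs = bool(np.allclose(original.predict_proba(X), restored.predict_proba(X)))
print(same_labels, same_probs)
```

One caveat with this approach: it only works while every value returned by get_params() is JSON-serializable, which holds for the default settings but not, for example, for a RandomState object passed as random_state.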

Note that with just 'coef_', 'intercept_', and 'classes_' you could do the predictions yourself. Logistic regression is a straightforward linear model, so a prediction is simply a matrix multiplication of the inputs with the coefficients, plus the intercept, passed through the logistic (sigmoid) function.
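
To make that concrete, here is a rough sketch of predicting straight from the JSON, using only the standard library. It assumes a binary model (so 'coef_' has a single row and 'classes_' has two entries) and the JSON layout produced by the functions above. The names sigmoid and predict_from_json are hypothetical, and the example weights are made up for illustration.

```python
import json
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def predict_from_json(jstring, rows):
    """Hypothetical helper: return (label, P(positive class)) for each input row.

    Assumes a binary model: coef_ has shape (1, n_features) and classes_ has 2 entries.
    """
    params = json.loads(jstring)["model_params"]
    weights = params["coef_"][0]
    bias = params["intercept_"][0]
    negative, positive = params["classes_"]
    results = []
    for row in rows:
        # Linear score: dot product of weights and features, plus intercept
        score = sum(w * x for w, x in zip(weights, row)) + bias
        # A positive score maps to the second entry of classes_
        label = positive if score > 0 else negative
        results.append((label, sigmoid(score)))
    return results


# Made-up parameters for illustration only (not from a real fitted model)
example_json = json.dumps({
    "model_params": {
        "coef_": [[1.0, -2.0]],
        "intercept_": [0.5],
        "classes_": [0.0, 1.0],
    }
})
predictions = predict_from_json(example_json, [[1.0, 0.0], [0.0, 1.0]])
print(predictions)
```

A multiclass model would need one score per class and a softmax (or one-vs-rest) step instead, which this sketch does not cover.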
