如何在(GridSearchCV)拟合模型后打印估计系数?(SGDRegressor) [英] how to print estimated coefficients after a (GridSearchCV) fit a model? (SGDRegressor)

查看:61
本文介绍了如何在(GridSearchCV)拟合模型后打印估计系数?(SGDRegressor)的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我是 scikit-learn 的新手,但它实现了我的期望.现在,令人抓狂的是,唯一剩下的问题是我不知道如何打印(或者更好的是,写入一个小文本文件)它估计的所有系数,它选择的所有特征.有什么方法可以做到这一点?

I am new to scikit-learn, but it did what I was hoping for. Now, maddeningly, the only remaining issue is that I don't find how I could print (or even better, write to a small text file) all the coefficients it estimated, all the features it selected. What is the way to do this?

与 SGDClassifier 相同,但我认为对于所有可以拟合的基础对象都是相同的,无论是否进行交叉验证.完整脚本如下.

Same with SGDClassifier, but I think it is the same for all base objects that can be fit, with cross validation or without. Full script below.

import scipy as sp
import numpy as np
import pandas as pd
import multiprocessing as mp
from sklearn import grid_search
from sklearn import cross_validation
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import SGDClassifier


def main():
    print("Started.")
    # n = 10**6
    # notreatadapter = iopro.text_adapter('S:/data/controls/notreat.csv', parser='csv')
    # X = notreatadapter[1:][0:n]
    # y = notreatadapter[0][0:n]
    notreatdata = pd.read_stata('S:/data/controls/notreat.dta')
    notreatdata = notreatdata.iloc[:10000,:]
    X = notreatdata.iloc[:,1:]
    y = notreatdata.iloc[:,0]
    n = y.shape[0]

    print("Data lodaded.")
    X_train, X_test, y_train, y_test = cross_validation.train_test_split(X, y, test_size=0.4, random_state=0)

    print("Data split.")
    scaler = StandardScaler()
    scaler.fit(X_train)  # Don't cheat - fit only on training data
    X_train = scaler.transform(X_train)
    X_test = scaler.transform(X_test)  # apply same transformation to test data

    print("Data scaled.")
    # build a model
    model = SGDClassifier(penalty='elasticnet',n_iter = np.ceil(10**6 / n),shuffle=True)
    #model.fit(X,y)

    print("CV starts.")
    # run grid search
    param_grid = [{'alpha' : 10.0**-np.arange(1,7),'l1_ratio':[.05, .15, .5, .7, .9, .95, .99, 1]}]
    gs = grid_search.GridSearchCV(model,param_grid,n_jobs=8,verbose=1)
    gs.fit(X_train, y_train)

    print("Scores for alphas:")
    print(gs.grid_scores_)
    print("Best estimator:")
    print(gs.best_estimator_)
    print("Best score:")
    print(gs.best_score_)
    print("Best parameters:")
    print(gs.best_params_)


if __name__=='__main__':
    mp.freeze_support()
    main()

推荐答案

装有最佳超参数的 SGDClassifier 实例存储在 gs.best_estimator_ 中.coef_intercept_ 是该最佳模型的拟合参数.

The SGDClassifier instance fitted with the best hyperparameters is stored in gs.best_estimator_. The coef_ and intercept_ are the fitted parameters of that best model.

这篇关于如何在(GridSearchCV)拟合模型后打印估计系数?(SGDRegressor)的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆