How to graph grid scores from GridSearchCV?


Problem description


    I am looking for a way to graph grid_scores_ from GridSearchCV in sklearn. In this example I am trying to grid search for best gamma and C parameters for an SVR algorithm. My code looks as follows:

        C_range = 10.0 ** np.arange(-4, 4)
        gamma_range = 10.0 ** np.arange(-4, 4)
        param_grid = dict(gamma=gamma_range.tolist(), C=C_range.tolist())
        grid = GridSearchCV(SVR(kernel='rbf', gamma=0.1),param_grid, cv=5)
        grid.fit(X_train,y_train)
        print(grid.grid_scores_)
    

    After I run the code and print the grid scores I get the following outcome:

    [mean: -3.28593, std: 1.69134, params: {'gamma': 0.0001, 'C': 0.0001}, mean: -3.29370, std: 1.69346, params: {'gamma': 0.001, 'C': 0.0001}, mean: -3.28933, std: 1.69104, params: {'gamma': 0.01, 'C': 0.0001}, mean: -3.28925, std: 1.69106, params: {'gamma': 0.1, 'C': 0.0001}, mean: -3.28925, std: 1.69106, params: {'gamma': 1.0, 'C': 0.0001}, mean: -3.28925, std: 1.69106, params: {'gamma': 10.0, 'C': 0.0001},etc] 
    

    I would like to visualize all the scores (mean values) depending on gamma and C parameters. The graph I am trying to obtain should look as follows:

    Where x-axis is gamma, y-axis is mean score (root mean square error in this case), and different lines represent different C values.

    Solution

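    # Note: sklearn.grid_search and grid_scores_ are the pre-0.20 scikit-learn API;
    # both were removed in scikit-learn 0.20.
    from sklearn.svm import SVC
    from sklearn.grid_search import GridSearchCV
    from sklearn import datasets
    import matplotlib.pyplot as plt
    import seaborn as sns  # optional, not otherwise used below
    import numpy as np

    # Example data: the digits classification set
    digits = datasets.load_digits()
    X = digits.data
    y = digits.target

    # Parameter values to search over
    clf_ = SVC(kernel='rbf')
    Cs = [1, 10, 100, 1000]
    Gammas = [1e-3, 1e-4]
    clf = GridSearchCV(clf_,
                       dict(C=Cs, gamma=Gammas),
                       cv=2,
                       pre_dispatch='1*n_jobs',
                       n_jobs=1)

    clf.fit(X, y)

    # x[1] is the mean validation score of each grid point
    scores = [x[1] for x in clf.grid_scores_]
    # One row per C value, one column per gamma value
    scores = np.array(scores).reshape(len(Cs), len(Gammas))

    # One line per C value, with gamma on the x-axis
    for ind, i in enumerate(Cs):
        plt.plot(Gammas, scores[ind], label='C: ' + str(i))
    plt.legend()
    plt.xlabel('Gamma')
    plt.ylabel('Mean score')
    plt.show()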
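
On scikit-learn 0.20 and later, `sklearn.grid_search` and `grid_scores_` are no longer available; the mean scores are exposed through `cv_results_['mean_test_score']` instead. What follows is a minimal sketch of the same plot under that assumption. The helper names `fit_and_collect_scores` and `plot_scores` are made up for illustration, and the log-scaled x-axis is an added choice, not part of the original answer.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

Cs = [1, 10, 100, 1000]
Gammas = [1e-3, 1e-4]


def fit_and_collect_scores(X, y, Cs, Gammas, cv=2):
    """Run the grid search; return mean test scores shaped (len(Cs), len(Gammas))."""
    search = GridSearchCV(SVC(kernel='rbf'), dict(C=Cs, gamma=Gammas), cv=cv)
    search.fit(X, y)
    # One entry per parameter combination, in the same order as cv_results_['params']
    means = search.cv_results_['mean_test_score']
    # Same ordering assumption as the reshape above: C varies slowest, gamma fastest
    return np.asarray(means).reshape(len(Cs), len(Gammas))


def plot_scores(scores, Cs, Gammas):
    """Draw one line per C value with gamma on the x-axis; return the Axes."""
    fig, ax = plt.subplots()
    for row, C in zip(scores, Cs):
        ax.plot(Gammas, row, marker='o', label='C: ' + str(C))
    ax.set_xscale('log')  # gamma values span orders of magnitude
    ax.set_xlabel('Gamma')
    ax.set_ylabel('Mean score')
    ax.legend()
    return ax
```

Usage would look like `digits = datasets.load_digits()`, then `plot_scores(fit_and_collect_scores(digits.data, digits.target, Cs, Gammas), Cs, Gammas)` followed by `plt.show()`.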
    

    • Code is based on this.
    • The one uncertain part: whether sklearn always preserves the order of C and gamma in the returned scores. The official example relies on this ordering.
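
One way to avoid depending on that ordering is to place each score by the parameter values recorded alongside it, instead of reshaping by position. The sketch below assumes each score comes with a dict of its parameters (the first element of each `grid_scores_` entry, or `cv_results_['params']` on newer versions); `scores_to_grid` is a hypothetical helper name, and the sample data is invented purely to show the behaviour.

```python
import numpy as np


def scores_to_grid(params, mean_scores, row_key, row_values, col_key, col_values):
    """Build a (rows x cols) score matrix by looking up each score's own parameters."""
    grid = np.full((len(row_values), len(col_values)), np.nan)
    for p, score in zip(params, mean_scores):
        i = row_values.index(p[row_key])  # row chosen by this entry's own value
        j = col_values.index(p[col_key])  # column chosen by this entry's own value
        grid[i, j] = score
    return grid


# Invented example: entries deliberately listed out of order
params = [{'C': 10, 'gamma': 1e-3}, {'C': 1, 'gamma': 1e-4},
          {'C': 1, 'gamma': 1e-3}, {'C': 10, 'gamma': 1e-4}]
means = [0.3, 0.2, 0.1, 0.4]
grid = scores_to_grid(params, means, 'C', [1, 10], 'gamma', [1e-3, 1e-4])
print(grid)
```

Each row of `grid` can then be passed to `plt.plot(Gammas, grid[ind], ...)` exactly as in the loop above; any cell left as NaN signals a parameter combination that was never scored.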

    Output:
