How to graph grid scores from GridSearchCV?
Problem description
I am looking for a way to graph grid_scores_ from GridSearchCV in sklearn. In this example I am trying to grid search for best gamma and C parameters for an SVR algorithm. My code looks as follows:

C_range = 10.0 ** np.arange(-4, 4)
gamma_range = 10.0 ** np.arange(-4, 4)
param_grid = dict(gamma=gamma_range.tolist(), C=C_range.tolist())
grid = GridSearchCV(SVR(kernel='rbf', gamma=0.1), param_grid, cv=5)
grid.fit(X_train, y_train)
print(grid.grid_scores_)

After I run the code and print the grid scores I get the following outcome:

[mean: -3.28593, std: 1.69134, params: {'gamma': 0.0001, 'C': 0.0001},
 mean: -3.29370, std: 1.69346, params: {'gamma': 0.001, 'C': 0.0001},
 mean: -3.28933, std: 1.69104, params: {'gamma': 0.01, 'C': 0.0001},
 mean: -3.28925, std: 1.69106, params: {'gamma': 0.1, 'C': 0.0001},
 mean: -3.28925, std: 1.69106, params: {'gamma': 1.0, 'C': 0.0001},
 mean: -3.28925, std: 1.69106, params: {'gamma': 10.0, 'C': 0.0001},
 etc]

I would like to visualize all the scores (mean values) depending on gamma and C parameters. The graph I am trying to obtain should look as follows:
Where x-axis is gamma, y-axis is mean score (root mean square error in this case), and different lines represent different C values.
Solution

from sklearn.svm import SVC
# sklearn.grid_search is the module path from scikit-learn releases before 0.20
from sklearn.grid_search import GridSearchCV
from sklearn import datasets
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np

digits = datasets.load_digits()
X = digits.data
y = digits.target

clf_ = SVC(kernel='rbf')
Cs = [1, 10, 100, 1000]
Gammas = [1e-3, 1e-4]
clf = GridSearchCV(clf_,
                   dict(C=Cs, gamma=Gammas),
                   cv=2,
                   pre_dispatch='1*n_jobs',
                   n_jobs=1)
clf.fit(X, y)

# x[1] is the mean validation score of each grid_scores_ entry
scores = [x[1] for x in clf.grid_scores_]
# Rows follow C, columns follow gamma (assumes gamma varies fastest in the grid)
scores = np.array(scores).reshape(len(Cs), len(Gammas))

# One line per C value, plotted against gamma
for ind, i in enumerate(Cs):
    plt.plot(Gammas, scores[ind], label='C: ' + str(i))
plt.legend()
plt.xlabel('Gamma')
plt.ylabel('Mean score')
plt.show()

- Code is based on this.
- The only puzzling part: will sklearn always respect the order of C and gamma? The official example relies on this "ordering".
Output: (figure not reproduced here; mean score plotted against gamma, one line per C value)
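
On the ordering question raised in the notes above, here is a minimal sketch that avoids relying on the iteration order at all. It assumes scikit-learn 0.20 or later, where GridSearchCV lives in sklearn.model_selection and exposes cv_results_ instead of grid_scores_. The helper name mean_score_table and the output file name grid_scores.png are made up for this illustration. The sketch also prints the first few entries of ParameterGrid, which sorts parameter names and varies the last one fastest, so gamma changes fastest within each C.

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn import datasets
from sklearn.model_selection import GridSearchCV, ParameterGrid
from sklearn.svm import SVC

Cs = [1, 10, 100, 1000]
Gammas = [1e-3, 1e-4]
param_grid = {"C": Cs, "gamma": Gammas}

# Inspect the order in which the grid is enumerated.
first_three = list(ParameterGrid(param_grid))[:3]
print(first_three)


def mean_score_table(cv_results, Cs, Gammas):
    """Hypothetical helper: place each mean test score by its parameter
    values, so the result does not depend on the grid iteration order."""
    table = np.full((len(Cs), len(Gammas)), np.nan)
    for params, mean in zip(cv_results["params"], cv_results["mean_test_score"]):
        table[Cs.index(params["C"]), Gammas.index(params["gamma"])] = mean
    return table


digits = datasets.load_digits()
clf = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=2)
clf.fit(digits.data, digits.target)

# Rows correspond to Cs, columns to Gammas.
scores = mean_score_table(clf.cv_results_, Cs, Gammas)

fig, ax = plt.subplots()
for C, row in zip(Cs, scores):
    ax.plot(Gammas, row, marker="o", label="C: " + str(C))
ax.set_xscale("log")
ax.set_xlabel("Gamma")
ax.set_ylabel("Mean score")
ax.legend()
# Assumed output file name; call plt.show() instead in an interactive session.
fig.savefig("grid_scores.png")
```

Filling the table by parameter value costs a few extra lines compared with reshape, but it stays correct even if the enumeration order were to differ from what the reshape expects.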