如何用GridSearchCV绘制网格分数? [英] How to graph grid scores from GridSearchCV?
问题描述
我正在寻找一种从sklearn中的GridSearchCV绘制grid_scores_图表的方法.在此示例中,我尝试对SVR算法的最佳伽玛和C参数进行网格搜索.我的代码如下:
I am looking for a way to graph grid_scores_ from GridSearchCV in sklearn. In this example I am trying to grid search for best gamma and C parameters for an SVR algorithm. My code looks as follows:
C_range = 10.0 ** np.arange(-4, 4)
gamma_range = 10.0 ** np.arange(-4, 4)
param_grid = dict(gamma=gamma_range.tolist(), C=C_range.tolist())
grid = GridSearchCV(SVR(kernel='rbf', gamma=0.1),param_grid, cv=5)
grid.fit(X_train,y_train)
print(grid.grid_scores_)
运行代码并打印网格分数后,我得到以下结果:
After I run the code and print the grid scores I get the following outcome:
[mean: -3.28593, std: 1.69134, params: {'gamma': 0.0001, 'C': 0.0001}, mean: -3.29370, std: 1.69346, params: {'gamma': 0.001, 'C': 0.0001}, mean: -3.28933, std: 1.69104, params: {'gamma': 0.01, 'C': 0.0001}, mean: -3.28925, std: 1.69106, params: {'gamma': 0.1, 'C': 0.0001}, mean: -3.28925, std: 1.69106, params: {'gamma': 1.0, 'C': 0.0001}, mean: -3.28925, std: 1.69106, params: {'gamma': 10.0, 'C': 0.0001},etc]
我想可视化所有取决于γ和C参数的分数(平均值).我尝试获取的图形应如下所示:
I would like to visualize all the scores (mean values) depending on gamma and C parameters. The graph I am trying to obtain should look as follows:
其中x轴是伽玛,y轴是均值(在这种情况下,均方根误差),并且不同的线表示不同的C值.
Where x-axis is gamma, y-axis is mean score (root mean square error in this case), and different lines represent different C values.
推荐答案
from sklearn.svm import SVC
from sklearn.grid_search import GridSearchCV
from sklearn import datasets
import matplotlib.pyplot as plt
import seaborn as sns
import numpy as np
digits = datasets.load_digits()
X = digits.data
y = digits.target
clf_ = SVC(kernel='rbf')
Cs = [1, 10, 100, 1000]
Gammas = [1e-3, 1e-4]
clf = GridSearchCV(clf_,
dict(C=Cs,
gamma=Gammas),
cv=2,
pre_dispatch='1*n_jobs',
n_jobs=1)
clf.fit(X, y)
scores = [x[1] for x in clf.grid_scores_]
scores = np.array(scores).reshape(len(Cs), len(Gammas))
for ind, i in enumerate(Cs):
plt.plot(Gammas, scores[ind], label='C: ' + str(i))
plt.legend()
plt.xlabel('Gamma')
plt.ylabel('Mean score')
plt.show()
- 代码基于此.
- 仅令人困惑的部分:sklearn将始终尊重C&的顺序. Gamma->官方示例使用此排序"
- Code is based on this.
- Only puzzling part: will sklearn always respect the order of C & Gamma -> official example uses this "ordering"
输出:
这篇关于如何用GridSearchCV绘制网格分数?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!