More than one estimator in GridSearchCV (sklearn)
Question
I was reading the sklearn documentation page for GridSearchCV. One of the attributes of a GridSearchCV object is best_estimator_. So here is my question: how do I pass more than one estimator to a GridSearchCV object? With a dictionary like {'SVC()':{'C':10, 'gamma':0.01}, 'DecTreeClass()':{....}}?
Recommended answer
GridSearchCV works on parameters. It trains multiple estimators of the same class (SVC, DecisionTreeClassifier, or some other classifier), one for each parameter combination specified in param_grid. best_estimator_ is the estimator that performs best on the data.
So essentially best_estimator_ is an object of that same class, initialized with the best parameters found.
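As a minimal sketch of the basic setup described above (the iris data, the parameter values, and cv=5 are illustrative assumptions, not part of the original question), a single-class search might look like this:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# One estimator class, several parameter combinations (2 x 2 = 4 candidates).
param_grid = {"C": [1, 10], "gamma": [0.01, 0.001]}
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

# best_estimator_ is an SVC configured with the winning parameters
# (with the default refit=True it is also refit on all of X, y).
best = search.best_estimator_
print(type(best).__name__)
print(search.best_params_)
```

Every candidate here is an SVC; only C and gamma vary between them.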
So in the basic setup you cannot use multiple estimator classes in one grid search.
But as a workaround, you can search over multiple estimators by using a pipeline in which the estimator itself is a "parameter" that GridSearchCV can set.
Something like this:
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

iris_data = load_iris()
X, y = iris_data.data, iris_data.target

# Just initialize the pipeline with any estimator you like;
# GridSearchCV swaps in each candidate listed below
pipe = Pipeline(steps=[('estimator', SVC())])

# Add a dict of estimator and estimator-related parameters to this list
params_grid = [
    {
        'estimator': [SVC()],
        'estimator__C': [1, 10, 100, 1000],
        'estimator__gamma': [0.001, 0.0001],
    },
    {
        'estimator': [DecisionTreeClassifier()],
        'estimator__max_depth': [1, 2, 3, 4, 5],
        # "auto" (an alias for "sqrt") is no longer accepted by recent
        # scikit-learn releases, so it is omitted here
        'estimator__max_features': [None, "sqrt", "log2"],
    },
    # {'estimator': [Any_other_estimator_you_want],
    #  'estimator__valid_param_of_your_estimator': [valid_values]},
]

grid = GridSearchCV(pipe, params_grid)

# Run the search over both estimator classes
grid.fit(X, y)
You can add as many dicts to the params_grid list as you like, but make sure each dict contains only parameters that are valid for its 'estimator'.
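To check which estimator class won after fitting, one option is to read the 'estimator' step back out of the fitted pipeline. The sketch below assumes a smaller grid than the one above, plus cv=5 and random_state=0 for repeatability; those values are illustrative only:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

pipe = Pipeline(steps=[("estimator", SVC())])
param_grid = [
    {"estimator": [SVC()], "estimator__C": [1, 10]},
    {
        "estimator": [DecisionTreeClassifier(random_state=0)],
        "estimator__max_depth": [2, 3],
    },
]

grid = GridSearchCV(pipe, param_grid, cv=5)
grid.fit(X, y)

# best_estimator_ is the whole pipeline; the winning model is its 'estimator' step.
winner = grid.best_estimator_.named_steps["estimator"]
print(type(winner).__name__, grid.best_score_)

# Every candidate, across both classes, is recorded in cv_results_.
for params, score in zip(grid.cv_results_["params"],
                         grid.cv_results_["mean_test_score"]):
    print(type(params["estimator"]).__name__, score)
```

Because each dict in the list is expanded separately, SVC candidates never receive max_depth and tree candidates never receive C.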