GridSearchCV output problems in Scikit-learn

Question

I'd like to perform a hyperparameter search for selecting preprocessing steps and models in sklearn as follows:

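from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures

# Three-step pipeline: feature combination, dimensionality reduction, classifier
pipeline = Pipeline([("combiner", PolynomialFeatures()),
                     ("dimred", PCA()),
                     ("classifier", RandomForestClassifier())])

# Candidate settings for each step (None disables the step)
parameters = [{"combiner": [None]},
              {"combiner": [PolynomialFeatures()], "combiner__degree": [2], "combiner__interaction_only": [False, True]},

              {"dimred": [None]},
              {"dimred": [PCA()], "dimred__n_components": [.95, .75]},

              {"classifier": [RandomForestClassifier(n_estimators=100, class_weight="balanced")],
               "classifier__max_depth": [5, 10, None]},
              {"classifier": [KNeighborsClassifier(weights="distance")],
               "classifier__n_neighbors": [3, 7, 11]}]

# train_X, train_y: training features and labels (not shown)
CV = GridSearchCV(pipeline, parameters, cv=5, scoring="f1_weighted", refit=True, n_jobs=-1)
CV.fit(train_X, train_y)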

Of course, I need the best pipeline together with its best parameters. However, when I request the best estimator with CV.best_estimator_, I get only the winning components, not their hyperparameters:

Pipeline(steps=[('combiner', None), ('dimred', PCA()),
                ('classifier', RandomForestClassifier())])

When I print CV.best_params_, I get even less information (only the first element of the Pipeline, the combiner, and nothing about dimred or classifier):

{'combiner': None}

How can I get the best pipeline combination, with its components and their hyperparameters?

Recommended answer

Pipeline objects have a get_params() method which returns the parameters of the pipeline. This includes the parameters of the individual steps as well. Based on your example, the command

CV.best_estimator_.get_params()

will retrieve all pipeline parameters of the best estimator, including those you are looking for.
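
As a minimal sketch of this, the example below runs a much smaller search and then reads the full parameter set from the refitted pipeline. The iris dataset, the two-step pipeline, the reduced grid, and the names search, all_params, and step_params are assumptions made for illustration; they are not part of the original question.

```python
from sklearn.datasets import load_iris
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline

# Assumption: iris stands in for the asker's train_X / train_y.
X, y = load_iris(return_X_y=True)

pipeline = Pipeline([("dimred", PCA()),
                     ("classifier", RandomForestClassifier(random_state=0))])

# A list of dicts is searched as separate grids, one dict at a time.
param_grid = [
    {"dimred": [None, PCA(n_components=2)]},
    {"classifier": [KNeighborsClassifier()], "classifier__n_neighbors": [3, 7]},
]

search = GridSearchCV(pipeline, param_grid, cv=3, scoring="f1_weighted")
search.fit(X, y)

# best_params_ holds only the keys of the winning dict.
print(search.best_params_)

# get_params() reports every step and every step-level hyperparameter.
all_params = search.best_estimator_.get_params()

# Keep the step-level entries and drop the Pipeline's own settings.
step_names = [name for name, _ in search.best_estimator_.steps]
step_params = {key: value for key, value in all_params.items()
               if key.split("__")[0] in step_names}
for key in sorted(step_params):
    print(f"{key} = {step_params[key]!r}")
```

Because a list of dictionaries is explored one dictionary at a time, best_params_ only ever contains the keys of the winning dictionary, and every other step keeps the value it had in the pipeline passed to GridSearchCV. To search combinations across steps, the settings would need to sit in the same dictionary.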
