Scikit - Combining scale and grid search

Views: 101 Published: 2020/10/11 19:56:10 python scikit-learn cross-validation grid-search


Problem description

I am new to scikit-learn and have two small issues combining data scaling with a grid search.

Efficient scaler

Considering cross-validation with K folds, I would like the data scaler (preprocessing.StandardScaler(), for instance) to be fit only on the K-1 training folds each time the model is trained, and then applied to the remaining fold.

My impression is that the following code fits the scaler on the entire dataset, so I would like to modify it to behave as described above:

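from sklearn import preprocessing, svm
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import make_pipeline

classifier = svm.SVC(C=1)
clf = make_pipeline(preprocessing.StandardScaler(), classifier)
# make_pipeline names the SVC step 'svc', so its C is addressed as 'svc__C'
tuned_parameters = [{'svc__C': [1, 10, 100, 1000]}]
my_grid_search = GridSearchCV(clf, tuned_parameters, cv=5)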
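As a rough illustration of the behaviour asked for above, the sketch below runs the K-fold loop by hand: a fresh clone of the pipeline is fit on each set of K-1 training folds, and the scaler's learned mean is compared with the mean of those training rows only. The synthetic data from make_classification and the names X, y, pipe and fold_means_match are assumptions made for this example, not part of the original question. GridSearchCV likewise clones and fits the whole estimator it is given on each training split, so a pipeline passed to it should behave the same way.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.model_selection import KFold
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data, assumed only for this illustration.
X, y = make_classification(n_samples=100, n_features=4, random_state=0)

pipe = make_pipeline(StandardScaler(), SVC(C=1))

fold_means_match = []
for train_idx, test_idx in KFold(n_splits=5).split(X):
    # A fresh, unfitted copy of the pipeline for this split.
    fold_pipe = clone(pipe)
    # Fitting the pipeline fits the scaler on the K-1 training folds only.
    fold_pipe.fit(X[train_idx], y[train_idx])
    scaler = fold_pipe.named_steps["standardscaler"]
    # The scaler's mean should equal the mean of the training rows alone.
    fold_means_match.append(
        np.allclose(scaler.mean_, X[train_idx].mean(axis=0))
    )
    # Scoring applies the already-fitted scaler to the held-out fold.
    fold_pipe.score(X[test_idx], y[test_idx])

print(fold_means_match)
```

If every entry is True, the scaler never saw the held-out fold during fitting.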

Retrieving the inner scaler fit

With refit=True, the model is refit (using the best estimator) on the entire dataset after the grid search. My understanding is that the pipeline is used again, so the scaler is also fit on the entire dataset. Ideally, I would like to reuse that fit to scale my test dataset. Is there a way to retrieve it directly from the GridSearchCV?
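One possible way to get at the refit scaler, sketched under a few assumptions: after fitting with refit=True, GridSearchCV exposes the refit pipeline as best_estimator_, and a pipeline built with make_pipeline exposes its steps through named_steps under lowercased class names ('standardscaler', 'svc'). Pipeline parameters in the grid are addressed as step__parameter, hence 'svc__C'. The synthetic data and the names search, fitted_scaler and X_test_scaled are illustrative only.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import GridSearchCV, train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic data, assumed only for this illustration.
X, y = make_classification(n_samples=200, n_features=4, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

pipe = make_pipeline(StandardScaler(), SVC())
search = GridSearchCV(pipe, {"svc__C": [1, 10, 100, 1000]}, cv=5, refit=True)
search.fit(X_train, y_train)

# The refit pipeline; its scaler was fit on all of X_train.
fitted_scaler = search.best_estimator_.named_steps["standardscaler"]

# Reuse that fit to scale the test data explicitly...
X_test_scaled = fitted_scaler.transform(X_test)

# ...or let the search object do it: predict/score run the full pipeline,
# so the test data is scaled with the same fitted scaler automatically.
test_accuracy = search.score(X_test, y_test)
```

In most cases the explicit transform is not needed, since calling predict or score on the fitted search object already passes the data through the scaler.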
