sklearn - How to retrieve PCA components and explained variance from inside a Pipeline passed to GridSearchCV
Question
I am using GridSearchCV with a pipeline as follows:
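from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

grid = GridSearchCV(
    Pipeline([
        ('reduce_dim', PCA()),
        ('classify', RandomForestClassifier(n_jobs=-1))
    ]),
    param_grid=[
        {
            # range() accepts only integers, so the variance fractions are listed
            # explicitly (0.9 is excluded, as a range-style stop value would be)
            'reduce_dim__n_components': [0.7, 0.8],
            'classify__n_estimators': range(10, 50, 5),
            # 'sqrt' is what 'auto' meant for classifiers; 'auto' was removed in scikit-learn 1.3
            'classify__max_features': ['sqrt', 0.2],
            'classify__min_samples_leaf': [40, 50, 60],
            'classify__criterion': ['gini', 'entropy']
        }
    ],
    cv=5, scoring='f1')
grid.fit(X, y)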
How do I now retrieve PCA details such as components and explained_variance from the grid.best_estimator_ model?
Furthermore, I also want to save best_estimator_ to a file using pickle and load it later. How do I retrieve the PCA details from the loaded estimator? I suspect it will be the same as above.
Recommended answer
grid.best_estimator_ gives you the pipeline refitted with the best parameters.
Now use the named_steps attribute to access the internal estimators of the pipeline.
So grid.best_estimator_.named_steps['reduce_dim'] gives you the fitted PCA object. You can then read its components_ and explained_variance_ attributes like this:
grid.best_estimator_.named_steps['reduce_dim'].components_
grid.best_estimator_.named_steps['reduce_dim'].explained_variance_
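The answer does not cover the pickle part of the question, so here is a minimal end-to-end sketch of both steps. The synthetic data from make_classification, the reduced parameter grid, cv=3, and the explicit svd_solver='full' are assumptions chosen to keep the example small and runnable; they are not taken from the question. The round trip is done in memory with pickle.dumps and pickle.loads, and pickle.dump and pickle.load on a file object should behave the same way.

```python
import pickle

import numpy as np
from sklearn.datasets import make_classification
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic binary-classification data (assumption: stands in for the asker's X and y)
X, y = make_classification(n_samples=200, n_features=10,
                           n_informative=5, random_state=0)

grid = GridSearchCV(
    Pipeline([
        # svd_solver='full' supports a float n_components (fraction of variance to keep)
        ('reduce_dim', PCA(svd_solver='full')),
        ('classify', RandomForestClassifier(n_estimators=10, random_state=0)),
    ]),
    param_grid={'reduce_dim__n_components': [0.7, 0.8]},
    cv=3,
    scoring='f1',
)
grid.fit(X, y)

# Fitted PCA step of the best pipeline
pca = grid.best_estimator_.named_steps['reduce_dim']
print(pca.components_.shape)          # (number of kept components, number of features)
print(pca.explained_variance_)        # variance captured by each kept component
print(pca.explained_variance_ratio_)  # the same, as a fraction of total variance

# Pickle round trip: the loaded pipeline exposes the same fitted attributes
loaded = pickle.loads(pickle.dumps(grid.best_estimator_))
loaded_pca = loaded.named_steps['reduce_dim']
print(np.allclose(loaded_pca.components_, pca.components_))
```

As the asker suspected, the loaded estimator is accessed exactly like the original one. Note that a pickled scikit-learn model is only reliably loadable with the same scikit-learn version that produced it.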