How can I avoid using estimator_params when using RFECV nested within GridSearchCV?
Question
I'm currently working on recursive feature elimination (RFECV) within a grid search (GridSearchCV) for tree-based methods using scikit-learn. To do this, I'm using the current dev version on GitHub (0.17), which allows RFECV to use the feature importances from tree methods to select features to discard.
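For clarity, this means:
- loop over the hyperparameters for the current tree method
- for each set of parameters, perform recursive feature elimination to obtain the optimal number of features
- report the "score" (e.g. accuracy)
- determine which set of parameters produced the best score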
This code is working fine at the moment, but I'm getting a deprecation warning about using estimator_params. Here is the current code:
# set up list of parameter dictionaries (better way to do this?)
depth = [1, 5, None]
weight = ['balanced', None]
params = []
for d in depth:
    for w in weight:
        params.append(dict(max_depth=d,
                           class_weight=w))
# specify the classifier
estimator = DecisionTreeClassifier(random_state=0,
                                   max_depth=None,
                                   class_weight='balanced')
# specify the feature selection method
selector = RFECV(estimator,
                 step=1,
                 cv=3,
                 scoring='accuracy')
# set up the parameter search, passing the list of dictionaries built above
clf = GridSearchCV(selector,
                   {'estimator_params': params},
                   cv=3)
clf.fit(X_train, y_train)
# the tree fitted on the selected features with the best parameters
clf.best_estimator_.estimator_
Here is the full deprecation warning:
home/csw34/git/scikit-learn/sklearn/feature_selection/rfe.py:154: DeprecationWarning:
The parameter 'estimator_params' is deprecated as of version 0.16 and will be removed in 0.18. The parameter is no longer necessary because the value is set via the estimator initialisation or set_params method.
How can I achieve the same result without using estimator_params in GridSearchCV to pass the parameters through RFECV to the estimator?
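Recommended answer
This solves your problem. Prefix each parameter name with estimator__ (double underscore) so that GridSearchCV sets it on the estimator wrapped by RFECV: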
params = {'estimator__max_depth': [1, 5, None],
          'estimator__class_weight': ['balanced', None]}
estimator = DecisionTreeClassifier()
selector = RFECV(estimator, step=1, cv=3, scoring='accuracy')
clf = GridSearchCV(selector, params, cv=3)
clf.fit(X_train, y_train)
clf.best_estimator_.estimator_
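For readers who want to run this end to end, here is a minimal self-contained sketch. It assumes a recent scikit-learn release, where GridSearchCV is imported from sklearn.model_selection (in the 0.17 version mentioned in the question it lived in sklearn.grid_search), and it substitutes a synthetic dataset from make_classification for X_train and y_train, which are not shown in the question. The random_state values are arbitrary choices for reproducibility.

```python
from sklearn.datasets import make_classification
from sklearn.feature_selection import RFECV
from sklearn.model_selection import GridSearchCV
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in for X_train / y_train (an assumption for illustration only).
X_train, y_train = make_classification(
    n_samples=200, n_features=10, n_informative=4, random_state=0
)

# Keys prefixed with "estimator__" are set on the tree wrapped by RFECV.
params = {
    "estimator__max_depth": [1, 5, None],
    "estimator__class_weight": ["balanced", None],
}

estimator = DecisionTreeClassifier(random_state=0)
selector = RFECV(estimator, step=1, cv=3, scoring="accuracy")
clf = GridSearchCV(selector, params, cv=3)
clf.fit(X_train, y_train)

best_selector = clf.best_estimator_   # RFECV refitted with the best parameters
best_tree = best_selector.estimator_  # tree fitted on the selected features only

print(clf.best_params_)               # winning max_depth / class_weight
print(best_selector.n_features_)      # number of features RFECV kept
print(best_selector.support_)         # boolean mask of the kept features
```

The grid search reports one score per parameter combination in clf.cv_results_, and each of those scores comes from an RFECV that has already chosen its own number of features, which matches the nested loop described in the question.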
To see every parameter name that can be used in the grid, use:
print(selector.get_params())
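As a small sketch of what that output contains, the snippet below filters the get_params() keys down to the ones that address the wrapped estimator, then sets one of them directly with set_params, the method the deprecation warning refers to. The selector here is rebuilt with a default DecisionTreeClassifier purely for illustration.

```python
from sklearn.feature_selection import RFECV
from sklearn.tree import DecisionTreeClassifier

selector = RFECV(DecisionTreeClassifier(), step=1, cv=3, scoring="accuracy")

# Keep only the keys that reach into the wrapped estimator.
nested_keys = sorted(
    key for key in selector.get_params() if key.startswith("estimator__")
)
print(nested_keys)

# The same names work with set_params, which replaces estimator_params.
selector.set_params(estimator__max_depth=5)
print(selector.estimator.max_depth)
```

Any key in nested_keys can appear in the dictionary passed to GridSearchCV.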