Using VotingClassifier in Sklearn Pipeline

Question

I want to apply a voting classifier to several pipeline classifiers and tune the parameters in a grid search. The following minimal example gives me an error. Do I have to do this differently?

from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import VotingClassifier
p1 = Pipeline([['clf1', RandomForestClassifier()]])
p2 = Pipeline([['clf2', AdaBoostClassifier()]])
# The estimators are passed without names here; this is what triggers the error
p3 = Pipeline([['clf3', VotingClassifier(estimators=(p1, p2))]])
p3.get_params()

Error:

TypeError: cannot convert dictionary update sequence element #0 to a sequence

Recommended answer

When you specify the estimators for VotingClassifier, you need to give each of them a name. The estimators argument expects a list of (name, estimator) tuples, not bare estimators:

from sklearn.pipeline import Pipeline
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import VotingClassifier
p1 = Pipeline([['clf1', RandomForestClassifier()]])
p2 = Pipeline([['clf2', AdaBoostClassifier()]])
p3 = Pipeline([['clf3', VotingClassifier(estimators=[("p1",p1), ("p2",p2)])]])
p3.get_params()

This will output:

{'clf3': VotingClassifier(estimators=[('p1', Pipeline(steps=[['clf1', RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
             max_depth=None, max_features='auto', max_leaf_nodes=None,
             min_impurity_split=1e-07, min_samples_leaf=1,
             min_samples_split=2, min_weight_fraction...SAMME.R', base_estimator=None,
           learning_rate=1.0, n_estimators=50, random_state=None)]]))],
          n_jobs=1, voting='hard', weights=None),
 'clf3__estimators': [('p1',
   Pipeline(steps=[['clf1', RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
               max_depth=None, max_features='auto', max_leaf_nodes=None,
               min_impurity_split=1e-07, min_samples_leaf=1,
               min_samples_split=2, min_weight_fraction_leaf=0.0,
               n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
               verbose=0, warm_start=False)]])),
  ('p2',
   Pipeline(steps=[['clf2', AdaBoostClassifier(algorithm='SAMME.R', base_estimator=None,
             learning_rate=1.0, n_estimators=50, random_state=None)]]))],
 'clf3__n_jobs': 1,
 'clf3__p1': Pipeline(steps=[['clf1', RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
             max_depth=None, max_features='auto', max_leaf_nodes=None,
             min_impurity_split=1e-07, min_samples_leaf=1,
             min_samples_split=2, min_weight_fraction_leaf=0.0,
             n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
             verbose=0, warm_start=False)]]),
 'clf3__p1__clf1': RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
             max_depth=None, max_features='auto', max_leaf_nodes=None,
             min_impurity_split=1e-07, min_samples_leaf=1,
             min_samples_split=2, min_weight_fraction_leaf=0.0,
             n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
             verbose=0, warm_start=False),
 'clf3__p1__clf1__bootstrap': True,
 'clf3__p1__clf1__class_weight': None,
 'clf3__p1__clf1__criterion': 'gini',
 'clf3__p1__clf1__max_depth': None,
 'clf3__p1__clf1__max_features': 'auto',
 'clf3__p1__clf1__max_leaf_nodes': None,
 'clf3__p1__clf1__min_impurity_split': 1e-07,
 'clf3__p1__clf1__min_samples_leaf': 1,
 'clf3__p1__clf1__min_samples_split': 2,
 'clf3__p1__clf1__min_weight_fraction_leaf': 0.0,
 'clf3__p1__clf1__n_estimators': 10,
 'clf3__p1__clf1__n_jobs': 1,
 'clf3__p1__clf1__oob_score': False,
 'clf3__p1__clf1__random_state': None,
 'clf3__p1__clf1__verbose': 0,
 'clf3__p1__clf1__warm_start': False,
 'clf3__p1__steps': [['clf1',
   RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
               max_depth=None, max_features='auto', max_leaf_nodes=None,
               min_impurity_split=1e-07, min_samples_leaf=1,
               min_samples_split=2, min_weight_fraction_leaf=0.0,
               n_estimators=10, n_jobs=1, oob_score=False, random_state=None,
               verbose=0, warm_start=False)]],
 'clf3__p2': Pipeline(steps=[['clf2', AdaBoostClassifier(algorithm='SAMME.R', base_estimator=None,
           learning_rate=1.0, n_estimators=50, random_state=None)]]),
 'clf3__p2__clf2': AdaBoostClassifier(algorithm='SAMME.R', base_estimator=None,
           learning_rate=1.0, n_estimators=50, random_state=None),
 'clf3__p2__clf2__algorithm': 'SAMME.R',
 'clf3__p2__clf2__base_estimator': None,
 'clf3__p2__clf2__learning_rate': 1.0,
 'clf3__p2__clf2__n_estimators': 50,
 'clf3__p2__clf2__random_state': None,
 'clf3__p2__steps': [['clf2',
   AdaBoostClassifier(algorithm='SAMME.R', base_estimator=None,
             learning_rate=1.0, n_estimators=50, random_state=None)]],
 'clf3__voting': 'hard',
 'clf3__weights': None,
 'steps': [['clf3',
   VotingClassifier(estimators=[('p1', Pipeline(steps=[['clf1', RandomForestClassifier(bootstrap=True, class_weight=None, criterion='gini',
               max_depth=None, max_features='auto', max_leaf_nodes=None,
               min_impurity_split=1e-07, min_samples_leaf=1,
               min_samples_split=2, min_weight_fraction...SAMME.R', base_estimator=None,
             learning_rate=1.0, n_estimators=50, random_state=None)]]))],
            n_jobs=1, voting='hard', weights=None)]]}
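
Since the question also asks about tuning in a grid search, here is a minimal sketch of how the parameter keys listed above could be passed to GridSearchCV. The step and estimator names (clf3, p1, clf1, p2, clf2) come from the answer; the synthetic data set, the candidate values, random_state=0 and cv=3 are assumptions chosen only to keep the example small and runnable.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import VotingClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline

# Synthetic data (an assumption; substitute your own X and y).
X, y = make_classification(n_samples=120, n_features=8, random_state=0)

# Same structure as in the answer: each estimator in the ensemble is named.
p1 = Pipeline([("clf1", RandomForestClassifier(random_state=0))])
p2 = Pipeline([("clf2", AdaBoostClassifier(random_state=0))])
p3 = Pipeline([("clf3", VotingClassifier(estimators=[("p1", p1), ("p2", p2)]))])

# Keys are built by joining the nested names with double underscores,
# exactly as they appear in the output of p3.get_params().
param_grid = {
    "clf3__p1__clf1__n_estimators": [5, 10],
    "clf3__p2__clf2__n_estimators": [5, 10],
}

search = GridSearchCV(p3, param_grid, cv=3)
search.fit(X, y)
print(search.best_params_)
```

Any key returned by p3.get_params() can be used in param_grid in the same way, for example clf3__voting or clf3__weights.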
