sklearn Pipeline: argument of type 'ColumnTransformer' is not iterable


Question

I am trying to use pipelines to feed an ensemble voting classifier, because I want the ensemble learner to combine models trained on different feature sets. To do this, I followed the tutorial at [1].

This is the code I have so far.

y = df1.index
x = preprocessing.scale(df1)

phy_features = ['A', 'B', 'C']
phy_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler())])
phy_processer = ColumnTransformer(transformers=[('phy', phy_transformer, phy_features)])

fa_features = ['D', 'E', 'F']
fa_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler())])
fa_processer = ColumnTransformer(transformers=[('fa', fa_transformer, fa_features)])


pipe_phy = Pipeline(steps=[('preprocessor', phy_processer ),('classifier', SVM)])
pipe_fa = Pipeline(steps=[('preprocessor', fa_processer ),('classifier', SVM)])

ens = VotingClassifier(estimators=[pipe_phy, pipe_fa])

cv = KFold(n_splits=10, random_state=None, shuffle=True)
for train_index, test_index in cv.split(x):
    x_train, x_test = x[train_index], x[test_index]
    y_train, y_test = y[train_index], y[test_index]
    ens.fit(x_train,y_train)
    print(ens.score(x_test, y_test))

However, running the code raises TypeError: argument of type 'ColumnTransformer' is not iterable at the line ens.fit(x_train,y_train).

This is the complete stack trace.

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File "C:\Program Files\JetBrains\PyCharm 2020.1.1\plugins\python\helpers\pydev\_pydev_bundle\pydev_umd.py", line 197, in runfile
    pydev_imports.execfile(filename, global_vars, local_vars)  # execute the script
  File "C:\Program Files\JetBrains\PyCharm 2020.1.1\plugins\python\helpers\pydev\_pydev_imps\_pydev_execfile.py", line 18, in execfile
    exec(compile(contents+"\n", file, 'exec'), glob, loc)
  File "C:/Users/ASUS/PycharmProjects/swelltest/enemble.py", line 112, in <module>
    ens.fit(x_train,y_train)
  File "C:\Users\ASUS\PycharmProjects\swelltest\venv\lib\site-packages\sklearn\ensemble\_voting.py", line 265, in fit
    return super().fit(X, transformed_y, sample_weight)
  File "C:\Users\ASUS\PycharmProjects\swelltest\venv\lib\site-packages\sklearn\ensemble\_voting.py", line 65, in fit
    names, clfs = self._validate_estimators()
  File "C:\Users\ASUS\PycharmProjects\swelltest\venv\lib\site-packages\sklearn\ensemble\_base.py", line 228, in _validate_estimators
    self._validate_names(names)
  File "C:\Users\ASUS\PycharmProjects\swelltest\venv\lib\site-packages\sklearn\utils\metaestimators.py", line 77, in _validate_names
    invalid_names = [name for name in names if '__' in name]
  File "C:\Users\ASUS\PycharmProjects\swelltest\venv\lib\site-packages\sklearn\utils\metaestimators.py", line 77, in <listcomp>
    invalid_names = [name for name in names if '__' in name]
TypeError: argument of type 'ColumnTransformer' is not iterable

These are the values in the names list when the error occurs.

1- ColumnTransformer(transformers=[('phy',
                                 Pipeline(steps=[('imputer',
                                                  SimpleImputer(strategy='median')),
                                                 ('scaler', StandardScaler())]),
                                 ['HR', 'RMSSD', 'SCL'])])
2- ColumnTransformer(transformers=[('fa',
                                 Pipeline(steps=[('imputer',
                                                  SimpleImputer(strategy='median')),
                                                 ('scaler', StandardScaler())]),
                                 ['Squality', 'Sneutral', 'Shappy'])])

What causes this, and how can I fix it?

Recommended answer

The estimators parameter of VotingClassifier should be a list of (name, estimator) pairs, for example:

ens = VotingClassifier(estimators=[('phy', pipe_phy),
                                   ('fa', pipe_fa)])
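
For reference, here is a minimal end-to-end sketch of the corrected setup. Several things in it are assumptions rather than part of the question: the data is synthetic (df1 is not shown), SVC() stands in for the undefined SVM object, the make_pipe helper is hypothetical, and the number of folds is reduced to 5. It also keeps the input as a pandas DataFrame, because ColumnTransformer selects columns by name here, whereas preprocessing.scale returns a NumPy array with no column labels.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import VotingClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for df1 (assumption: the real data is not shown).
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(100, 6)), columns=list("ABCDEF"))
y = (X["A"] + X["D"] > 0).astype(int).to_numpy()


def make_pipe(name, features):
    """Hypothetical helper: impute and scale one feature subset, then fit an SVC."""
    transformer = Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
    ])
    preprocessor = ColumnTransformer(transformers=[(name, transformer, features)])
    return Pipeline(steps=[("preprocessor", preprocessor), ("classifier", SVC())])


pipe_phy = make_pipe("phy", ["A", "B", "C"])
pipe_fa = make_pipe("fa", ["D", "E", "F"])

# Each estimator is passed as a (name, estimator) pair.
ens = VotingClassifier(estimators=[("phy", pipe_phy), ("fa", pipe_fa)])

cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = []
for train_index, test_index in cv.split(X):
    # .iloc keeps the column labels that ColumnTransformer selects by.
    X_train, X_test = X.iloc[train_index], X.iloc[test_index]
    y_train, y_test = y[train_index], y[test_index]
    ens.fit(X_train, y_train)
    scores.append(ens.score(X_test, y_test))

print(scores)
```

With the default remainder='drop', each pipeline only sees its own three columns, so the two voters really are trained on different feature sets.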

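(In your code, scikit-learn tries to unpack each entry of estimators as a (name, estimator) pair. A bare Pipeline unpacks into its steps instead, so the ColumnTransformer ends up where the name string is expected, and the check '__' in name fails with the complaint that ColumnTransformer is not iterable.)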
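
The failure can be reproduced in isolation. The sketch below assumes that the estimators list is split into names and estimators with zip(*estimators), which is what the names list in the question suggests. It is an illustration of the mechanism, not a copy of scikit-learn's internals. StandardScaler stands in for the ColumnTransformer to keep the example short.

```python
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

pipe_a = Pipeline(steps=[("preprocessor", StandardScaler()), ("classifier", SVC())])
pipe_b = Pipeline(steps=[("preprocessor", StandardScaler()), ("classifier", SVC())])

# Unpack bare pipelines as if they were (name, estimator) pairs.
# A Pipeline can be indexed by step, so each one yields its two steps.
names, estimators = zip(*[pipe_a, pipe_b])
print(type(names[0]).__name__)  # the first step's class, not a string

message = None
try:
    # The same membership test that appears in the traceback.
    "__" in names[0]
except TypeError as exc:
    message = str(exc)
    print(message)
```

Wrapping each pipeline in a (name, pipeline) tuple makes the first unpacked element a string, so the membership test succeeds.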
