Sklearn voting ensemble with models using different features and testing with k-fold cross-validation


Question

I have a data frame with four different groups of features.

I need to create four different models, one per feature group, and combine them with an ensemble voting classifier. Furthermore, I need to test the classifier using k-fold cross-validation.

However, I am finding it difficult to combine different feature sets, a voting classifier and k-fold cross-validation using the functionality available in sklearn. The following is the code that I have so far.

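from sklearn import preprocessing, svm
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.neural_network import MLPClassifier

# The labels are taken from the data frame's index; all columns are features.
y = df1.index
x = preprocessing.scale(df1)

# One classifier per intended feature group.
SVM = svm.SVC(kernel='rbf', C=1)
rf = RandomForestClassifier(n_estimators=200)
ann = MLPClassifier(solver='lbfgs', alpha=1e-5, hidden_layer_sizes=(25, 2), random_state=1)
neigh = KNeighborsClassifier(n_neighbors=10)

models = [
    ('facial', SVM),
    ('posture', rf),
    ('computer', ann),
    ('physio', neigh),
]

# Every estimator in the ensemble is fitted on the same matrix x.
ens = VotingClassifier(estimators=models)

cv = KFold(n_splits=10, random_state=None, shuffle=True)
scores = cross_val_score(ens, x, y, cv=cv, scoring='accuracy')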

As you can see, this program uses the same features for all four models. How can I improve this program to achieve my objective?

Recommended answer

I did manage to achieve this using Pipelines:

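from sklearn import svm
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import VotingClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import KFold
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

y = df1.index
# Keep x as a DataFrame: ColumnTransformer selects columns by name, and
# scaling happens inside each pipeline, so preprocessing.scale is not needed.
x = df1

# Same SVC configuration as in the question.
SVM = svm.SVC(kernel='rbf', C=1)

phy_features = ['A', 'B', 'C']
phy_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler())])
phy_processer = ColumnTransformer(transformers=[('phy', phy_transformer, phy_features)])

fa_features = ['D', 'E', 'F']
fa_transformer = Pipeline(steps=[('imputer', SimpleImputer(strategy='median')), ('scaler', StandardScaler())])
fa_processer = ColumnTransformer(transformers=[('fa', fa_transformer, fa_features)])

# Each pipeline keeps only its own feature group, then classifies.
pipe_phy = Pipeline(steps=[('preprocessor', phy_processer), ('classifier', SVM)])
pipe_fa = Pipeline(steps=[('preprocessor', fa_processer), ('classifier', SVM)])

# VotingClassifier expects a list of (name, estimator) tuples.
ens = VotingClassifier(estimators=[('phy', pipe_phy), ('fa', pipe_fa)])

cv = KFold(n_splits=10, random_state=None, shuffle=True)
for train_index, test_index in cv.split(x):
    # cv.split yields positional indices, so index the DataFrame with iloc.
    x_train, x_test = x.iloc[train_index], x.iloc[test_index]
    y_train, y_test = y[train_index], y[test_index]
    ens.fit(x_train, y_train)
    print(ens.score(x_test, y_test))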
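
The same idea can be tried end to end on synthetic data. The following is only a minimal sketch, not the asker's actual setup: the column names, the three feature groups, the choice of classifiers and the helper `group_pipeline` are all assumptions made for illustration. It also uses `cross_val_score` (as in the question) in place of the manual loop.

```python
import numpy as np
import pandas as pd
from sklearn.compose import ColumnTransformer
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import KFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for df1 (assumption): six columns and a binary label.
rng = np.random.default_rng(0)
n = 200
label = rng.integers(0, 2, size=n)
values = rng.normal(size=(n, 6))
values[:, [0, 2, 4]] += label[:, None]  # make one column per group informative
df = pd.DataFrame(values, columns=list("ABCDEF"))


def group_pipeline(columns, classifier):
    """Hypothetical helper: keep one feature group, impute, scale, classify."""
    transformer = Pipeline(steps=[
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
    ])
    # remainder defaults to "drop", so columns outside the group are discarded.
    preprocessor = ColumnTransformer(transformers=[("group", transformer, columns)])
    return Pipeline(steps=[("preprocessor", preprocessor), ("classifier", classifier)])


# One pipeline per feature group, each with its own classifier.
ens = VotingClassifier(estimators=[
    ("svm_ab", group_pipeline(["A", "B"], SVC(kernel="rbf", C=1))),
    ("rf_cd", group_pipeline(["C", "D"], RandomForestClassifier(n_estimators=50, random_state=0))),
    ("knn_ef", group_pipeline(["E", "F"], KNeighborsClassifier(n_neighbors=10))),
])

# Pass the DataFrame itself so that column names survive the fold splitting.
cv = KFold(n_splits=5, shuffle=True, random_state=0)
scores = cross_val_score(ens, df, label, cv=cv, scoring="accuracy")
print(scores)
print(scores.mean())
```

Because the imputer and scaler sit inside each pipeline, they are refitted on the training part of every fold, so no statistics from the test part leak into preprocessing.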

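If you get the TypeError "argument of type 'ColumnTransformer' is not iterable" when using ColumnTransformer, please refer to the question "sklearn pipelines: argument of type 'ColumnTransformer' is not iterable".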
