How to use `log_loss` in `GridSearchCV` with multi-class labels in Scikit-Learn (sklearn)?


Question

I'm trying to use log_loss as the scoring parameter of GridSearchCV to tune this multi-class (6 classes) classifier. I don't understand how to give it the labels parameter. Even if I passed sklearn.metrics.log_loss directly, the labels present would change for each iteration of the cross-validation, so I don't see how to supply the labels parameter.

I'm using Python v3.6 and Scikit-Learn v0.18.1.

How can I use GridSearchCV with log_loss for multi-class model tuning?

My class distribution:

1    31
2    18
3    28
4    19
5    17
6    22
Name: encoding, dtype: int64

My code:

param_test = {"criterion": ["friedman_mse", "mse", "mae"]}
gsearch_gbc = GridSearchCV(estimator = GradientBoostingClassifier(n_estimators=10), 
                        param_grid = param_test, scoring="log_loss", n_jobs=1, iid=False, cv=cv_indices)
gsearch_gbc.fit(df_attr, Se_targets)

Here's the tail end of the error; the full traceback is here: https://pastebin.com/1CshpEBN

ValueError: y_true contains only one label (1). Please provide the true labels explicitly through the labels argument.

UPDATE: Just use this to build the scorer, based on @Grr's answer:

log_loss_build = lambda y: metrics.make_scorer(metrics.log_loss, greater_is_better=False, needs_proba=True, labels=sorted(np.unique(y)))
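As a rough sketch of how such a scorer plugs into GridSearchCV, the example below uses a plain callable scorer instead of make_scorer, since the needs_proba argument is not accepted by every scikit-learn release. The synthetic data, the neg_log_loss_scorer name, and the max_depth grid are assumptions made for illustration and are not from the original post.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import log_loss
from sklearn.model_selection import GridSearchCV, StratifiedKFold


def neg_log_loss_scorer(estimator, X, y):
    """Hypothetical scorer: negated log loss, so that greater is better."""
    proba = estimator.predict_proba(X)
    # predict_proba columns follow estimator.classes_, so pass those as labels.
    return -log_loss(y, proba, labels=estimator.classes_)


# Synthetic 6-class data standing in for df_attr / Se_targets (assumption).
X, y = make_classification(
    n_samples=300,
    n_features=10,
    n_informative=6,
    n_classes=6,
    n_clusters_per_class=1,
    random_state=0,
)

# Stratified folds keep every class present in each training split.
cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)

search = GridSearchCV(
    estimator=GradientBoostingClassifier(n_estimators=10, random_state=0),
    param_grid={"max_depth": [2, 3]},  # illustrative grid only
    scoring=neg_log_loss_scorer,
    cv=cv,
)
search.fit(X, y)
print(search.best_params_, search.best_score_)
```

Because the score is a negated loss, best_score_ is negative and values closer to zero are better. This sketch would still fail if a validation fold contained a class that was absent from the corresponding training fold.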

Recommended answer

My assumption is that one of your data splits somehow has only one class label in y_true. While this seems unlikely based on the distribution you posted, I guess it is possible. I haven't run into this before, but it seems that in [sklearn.metrics.log_loss](http://scikit-learn.org/stable/modules/generated/sklearn.metrics.log_loss.html) the labels argument is expected if the true labels are all the same. The wording of that section of the documentation also makes it seem as if the function assumes a binary classification if labels is not passed.
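The behaviour described above can be reproduced in isolation. The following is a minimal sketch, assuming a hypothetical validation fold whose true labels are all class 1 and uniform predicted probabilities over six classes; the arrays are made up for illustration.

```python
import numpy as np
from sklearn.metrics import log_loss

# Hypothetical fold in which every true label is class 1.
y_true = np.array([1, 1, 1])
all_labels = [1, 2, 3, 4, 5, 6]

# Uniform predicted probabilities over the six classes (illustrative only).
y_prob = np.full((3, 6), 1.0 / 6.0)

# Without labels, log_loss cannot tell which column belongs to which class.
try:
    log_loss(y_true, y_prob)
    raised = False
except ValueError as exc:
    raised = True
    print(exc)

# With the full label set supplied, the loss is well defined.
loss = log_loss(y_true, y_prob, labels=all_labels)
print(loss)
```

With uniform probabilities over six classes, the resulting loss is ln(6).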

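Now, as you correctly assume, you should supply the labels to log_loss when building the scorer. log_loss cannot be called with labels alone, so wrap it with make_scorer and pass the result to the scoring parameter, e.g. scoring=make_scorer(log_loss, greater_is_better=False, needs_proba=True, labels=your_labels).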
