Multiclass classification with xgboost classifier?


Question

I am trying out multi-class classification with xgboost and I've built it using this code,

import xgboost as xgb

# 1000 trees of max depth 7; the objective is left at its default
clf = xgb.XGBClassifier(max_depth=7, n_estimators=1000)

clf.fit(byte_train, y_train)
train1 = clf.predict_proba(train_data)   # class probabilities on the training set
test1 = clf.predict_proba(test_data)     # class probabilities on the test set
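
As a reference for the log-loss figure mentioned in the next paragraph, here is a minimal sketch of how it could be computed from those probabilities, assuming true held-out labels y_test exist (a hypothetical name, not in the original):

from sklearn.metrics import log_loss

# y_test is hypothetical: the true labels for test_data, if available
print(log_loss(y_test, test1))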

This gave me some good results. I've got log-loss below 0.7 for my case. But after looking through a few pages I've found that we have to use another objective in XGBClassifier for multi-class problems. Here's what is recommended on those pages.

clf = xgb.XGBClassifier(max_depth=5, objective='multi:softprob',
                        n_estimators=1000, num_class=9)

clf.fit(byte_train, y_train)  
train1 = clf.predict_proba(train_data)
test1 = clf.predict_proba(test_data)

This code also works, but it takes a lot longer to complete than my first code.

Why does my first code also work for the multi-class case? I have checked that its default objective is binary:logistic, which is used for binary classification, but it worked really well for multi-class. Which one should I use if both are correct?

Answer

By default, XGBClassifier uses objective='binary:logistic'. When you use this objective, it employs one of these strategies: one-vs-rest (also known as one-vs-all) or one-vs-one. That may not be the right choice for the problem at hand.
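
For intuition, the one-vs-rest strategy can be spelled out explicitly with scikit-learn's wrapper. This is only a sketch of that general scheme, reusing the question's byte_train / y_train / test_data names; it is not claimed to be what XGBoost does internally:

from sklearn.multiclass import OneVsRestClassifier
import xgboost as xgb

# One binary:logistic model is fitted per class; the predicted class is the
# one whose binary model gives the highest score.
ovr = OneVsRestClassifier(xgb.XGBClassifier(max_depth=7, n_estimators=1000))
ovr.fit(byte_train, y_train)
ovr_probs = ovr.predict_proba(test_data)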

When you use objective='multi:softprob', the output is a matrix with one probability per class for each data point, i.e. of size (number of data points) x (number of classes). As a result, the time complexity of your code increases.
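
To make that output shape concrete, here is a small sketch using the fitted classifier from the question (the 9-class figure comes from the question's num_class setting):

probs = clf.predict_proba(test_data)
# With multi:softprob each row holds one probability per class,
# so for 9 classes the shape is (n_test_samples, 9) and each row sums to 1.
print(probs.shape)
print(probs[0].sum())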

Try setting objective='multi:softmax' in your code. It is more apt for a multi-class classification task.
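
A minimal sketch of that change, assuming the same 9-class setup as in the question; note that with multi:softmax the booster returns a single predicted class per sample rather than per-class probabilities, so predict is used instead of predict_proba:

clf = xgb.XGBClassifier(max_depth=5, objective='multi:softmax',
                        n_estimators=1000, num_class=9)
clf.fit(byte_train, y_train)
pred_labels = clf.predict(test_data)   # one class label per test sample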
