xgboost: The meaning of the base_score parameter

Problem description

In the documentation of xgboost I read:

base_score [default=0.5] : the initial prediction score of all instances, global bias

What does this phrase mean? Is the base score the prior probability of the event of interest in the dataset? For example, in a dataset of 1,000 observations with 300 positives and 700 negatives, would the base score be 0.3?

If not, what would it be?

Your advice would be appreciated.

Recommended answer

I think your understanding is correct. In your example the base score could be set to 0.3, or you can simply leave it at the default of 0.5. For highly imbalanced data, initializing it to a more meaningful value can improve the learning process. In theory, as long as you choose a suitable learning rate and train for enough steps, the starting base score shouldn't affect the result. See the author's answer in the issue linked below.

Reference: https://github.com/dmlc/xgboost/issues/799
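
As a rough illustration of the suggestion above, the sketch below computes the empirical prior for the dataset in the question and places it in a parameter dictionary. It assumes the `binary:logistic` objective, where `base_score` is given as a probability and the boosting starts from the corresponding log-odds; the helper names `prior_base_score` and `logit` are made up for this example and are not part of xgboost.

```python
import math


def prior_base_score(labels):
    """Return the fraction of positive labels, i.e. the empirical P(y = 1)."""
    labels = list(labels)
    return sum(labels) / len(labels)


def logit(p):
    """Log-odds of a probability p (hypothetical helper, for illustration)."""
    return math.log(p / (1.0 - p))


# The dataset from the question: 300 positives and 700 negatives.
labels = [1] * 300 + [0] * 700
base_score = prior_base_score(labels)

# Parameter dictionary in the form accepted by xgboost's training API.
# Assumption: a binary classification task with the logistic objective.
params = {
    "objective": "binary:logistic",
    "base_score": base_score,
}

# With the logistic objective, the starting margin is the log-odds of
# base_score: 0.5 gives a margin of 0, while 0.3 gives a negative margin.
initial_margin = logit(base_score)
print(base_score, initial_margin)
```

The `params` dictionary could then be passed to `xgboost.train` together with a `DMatrix`. Whether a non-default starting point helps noticeably will depend on the data, the learning rate, and the number of boosting rounds, as the answer notes.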
