XGBoost prediction always returning the same value - why?


Question

I'm using SageMaker's built-in XGBoost algorithm with the following training and validation sets:

https://files.fm/u/pm7n8zcm

The prediction model produced by training on the above datasets always returns exactly the same result, regardless of the input.

Is there something obvious in the training or validation datasets that could explain this behavior?

Here is an example code snippet where I'm setting the hyperparameters:

{
    {"max_depth", "1000"},
    {"eta", "0.001"},
    {"min_child_weight", "10"},
    {"subsample", "0.7"},
    {"silent", "0"},
    {"objective", "reg:linear"},
    {"num_round", "50"}
}
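For reference, the same settings written as a plain Python dict (the original project configures them from C#; SageMaker's built-in XGBoost algorithm accepts hyperparameter values as strings, and the variable name here is just illustrative):

```python
# The questioner's hyperparameters as a plain Python dict.
# SageMaker's built-in XGBoost takes all hyperparameter values as strings.
hyperparameters = {
    "max_depth": "1000",
    "eta": "0.001",
    "min_child_weight": "10",
    "subsample": "0.7",
    "silent": "0",
    "objective": "reg:linear",
    "num_round": "50",
}
```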

Here is the source code: https://github.com/paulfryer/continuous-training/blob/master/ContinuousTraining/StateMachine/Retrain.cs#L326

It's not clear to me which hyperparameters might need to be adjusted.

This screenshot shows that I'm getting a result with 8 indexes:

But when I add the 11th one, it fails. This leads me to believe that I have to train the model with the zero indexes included instead of removing them, so I'll try that next. Update: retraining with the zero values included doesn't seem to help; I'm still getting the same value every time. I also noticed I can't send more than 10 values to the prediction endpoint, or it returns the error "Unable to evaluate payload provided". So at this point, using the libsvm format has only added more problems.
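The payload problem described above comes from the libsvm convention of omitting zero-valued features, which means different rows can carry different index sets. A minimal sketch of a serializer illustrates the difference (the helper name and the 1-based indexing are assumptions for illustration, not SageMaker's actual serializer):

```python
def to_libsvm(features, keep_zeros=False):
    """Serialize one feature vector as a libsvm row (1-based indices)."""
    pairs = [
        f"{i}:{v}"
        for i, v in enumerate(features, start=1)
        if keep_zeros or v != 0
    ]
    return " ".join(pairs)

row = [0.5, 0.0, 1.2]

# Default libsvm convention: zero-valued features are dropped,
# so the index set varies from row to row.
print(to_libsvm(row))                    # 1:0.5 3:1.2

# Keeping zeros makes every request carry the same indices.
print(to_libsvm(row, keep_zeros=True))   # 1:0.5 2:0.0 3:1.2
```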

Answer

You have a couple of mistakes here:

  1. Using {"num_round", "50"} with such a small eta ({"eta", "0.001"}) will get you nothing.
  2. {"max_depth", "1000"}: 1000 is insane! (the default value is 6)

Suggested:

    {"max_depth", "6"},
    {"eta", "0.05"},
    {"min_child_weight", "3"},
    {"subsample", "0.8"},
    {"silent", "0"},
    {"objective", "reg:linear"},
    {"num_round", "200"}
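The difference between the two configurations can be sketched with a toy calculation: treat each boosting round as removing a fraction eta of the remaining residual (an idealized best case, since real trees fit less than the full residual). Under that assumption, eta=0.001 over 50 rounds closes less than 5% of the gap between the base score and the target, so every prediction stays near the base score, while eta=0.05 over 200 rounds closes essentially all of it:

```python
def fraction_of_gap_closed(eta, num_round):
    """Idealized boosting: each round removes a fraction eta of the residual."""
    return 1 - (1 - eta) ** num_round

# Original settings: predictions barely move off the base score,
# which is why every input produces (almost) the same output.
print(fraction_of_gap_closed(0.001, 50))   # ~0.0488

# Suggested settings: the residual is essentially fully absorbed.
print(fraction_of_gap_closed(0.05, 200))   # ~0.99996
```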

Try this and report your output.

