XGBoost prediction always returning the same value - why?

Problem description

I'm using SageMaker's built-in XGBoost algorithm with the following training and validation sets:

https://files.fm/u/pm7n8zcm

Running the prediction model that comes out of training with the above datasets always produces the exact same result.

Is there something obvious in the training or validation datasets that could explain this behavior?

Here is an example code snippet where I'm setting the hyperparameters:

    {
        {"max_depth", "1000"},
        {"eta", "0.001"},
        {"min_child_weight", "10"},
        {"subsample", "0.7"},
        {"silent", "0"},
        {"objective", "reg:linear"},
        {"num_round", "50"}
    }

And here is the source code: https://github.com/paulfryer/continuous-training/blob/master/ContinuousTraining/StateMachine/Retrain.cs#L326

It's not clear to me which hyperparameters might need to be adjusted.

A screenshot shows that I'm getting a result back when I send 8 indexes.

But when I add the 11th one, it fails. This leads me to believe that I have to train the model with the zero indexes included instead of removing them, so I'll try that next. Update: retraining with the zero values included doesn't seem to help; I'm still getting the same value every time. I also noticed I can't send more than 10 values to the prediction endpoint or it returns the error "Unable to evaluate payload provided". So at this point, using the libsvm format has only added more problems.
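
For reference, libsvm is a sparse format: each line is "label index:value index:value ...", and zero-valued features are normally left out entirely rather than written as "index:0", so the indexes in a prediction payload have to line up with the indexes the model saw at training time. Below is a minimal C# sketch of how such a payload line gets built; the class name, method name, and sample values are purely illustrative and are not taken from Retrain.cs.

    using System;
    using System.Collections.Generic;
    using System.Globalization;
    using System.Linq;

    static class LibSvmPayload
    {
        // One libsvm line looks like "<label> <index>:<value> <index>:<value> ...".
        // Zero-valued features are skipped (that is what makes the format sparse),
        // so the remaining indexes must mean the same thing they meant during training.
        public static string BuildLine(double label, IReadOnlyList<double> features)
        {
            var parts = features
                .Select((value, i) => (Index: i + 1, Value: value)) // libsvm indexes are conventionally 1-based
                .Where(f => f.Value != 0.0)                         // omit zeros instead of sending "i:0"
                .Select(f => string.Format(CultureInfo.InvariantCulture, "{0}:{1}", f.Index, f.Value));
            return string.Format(CultureInfo.InvariantCulture, "{0} {1}", label, string.Join(" ", parts));
        }

        static void Main()
        {
            // For a prediction request the label is ignored, so 0 is a common placeholder.
            Console.WriteLine(BuildLine(0, new double[] { 1.5, 0, 0, 2.0, 0, 0, 0, 3.25 }));
            // Prints: 0 1:1.5 4:2 8:3.25
        }
    }

The point of the sketch is that dropping an index and sending an explicit zero are not interchangeable: the model only knows features by the indexes it saw during training, so the prediction payload has to follow the same sparse convention as the training data.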

Solution

You've got a few things wrong there.

  1. Using {"num_round", "50"} together with such a small eta ({"eta", "0.001"}) will get you nothing: the shrinkage means 50 rounds barely move the predictions away from the base score, so every input comes back with almost the same value (see the sketch after this list).
  2. {"max_depth", "1000"} is insane! (The default value is 6.) Trees that deep will memorize the training set rather than generalize.

Suggesting:

    {"max_depth", "6"},
    {"eta", "0.05"},
    {"min_child_weight", "3"},
    {"subsample", "0.8"},
    {"silent", "0"},
    {"objective", "reg:linear"},
    {"num_round", "200"}

Try this and report your output.
