How to get the params from a saved XGBoost model


Question

I'm trying to train an XGBoost model using the params below:

xgb_params = {
    'objective': 'binary:logistic',
    'eval_metric': 'auc',
    'lambda': 0.8,
    'alpha': 0.4,
    'max_depth': 10,
    'max_delta_step': 1,
    'verbose': True
}

Since my input data is too big to be fully loaded into memory, I adopted incremental training:

xgb_clf = xgb.train(xgb_params, input_data, num_boost_round=rounds_per_batch,
                    xgb_model=model_path)
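
(For reference, the batch-by-batch loop around this call might look like the sketch below; the iter_batches generator is hypothetical and stands in for whatever chunked loading of the input data is actually used.)

booster = None
for X_batch, y_batch in iter_batches():              # hypothetical chunked loader
    dtrain = xgb.DMatrix(X_batch, label=y_batch)
    booster = xgb.train(xgb_params, dtrain,
                        num_boost_round=rounds_per_batch,
                        xgb_model=booster)            # continue from the previous booster
booster.save_model(model_path)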

The code for prediction is:

xgb_clf = xgb.XGBClassifier()
booster = xgb.Booster()
booster.load_model(model_path)
xgb_clf._Booster = booster
raw_probas = xgb_clf.predict_proba(x)

The result seemed good. But when I tried to invoke xgb_clf.get_xgb_params(), I got a param dict in which all params were set to default values.

I can guess that the root cause is that when I initialized the model, I didn't pass in any params. So the model was initialized with default values, but when it predicted, it used an internal booster that had been fitted using some pre-defined params.

However, I wonder whether there is any way that, after I assign a pre-trained booster model to an XGBClassifier, I can see the real params that were used to train the booster, rather than those used to initialize the classifier.

Answer

You seem to be mixing the sklearn API with the functional API in your code. If you stick to either one, you should get the parameters to persist in the pickle. Here's an example using the sklearn API.

import pickle
import numpy as np
import xgboost as xgb
from sklearn.datasets import load_digits


digits = load_digits(n_class=2)  # binary subset: digits 0 and 1
y = digits['target']
X = digits['data']

xgb_params = {
    'objective': 'binary:logistic',
    'reg_lambda': 0.8,
    'reg_alpha': 0.4,
    'max_depth': 10,
    'max_delta_step': 1,
}
# passing the params through the sklearn wrapper keeps them stored on the classifier
clf = xgb.XGBClassifier(**xgb_params)
clf.fit(X, y, eval_metric='auc', verbose=True)

pickle.dump(clf, open("xgb_temp.pkl", "wb"))
clf2 = pickle.load(open("xgb_temp.pkl", "rb"))

assert np.allclose(clf.predict(X), clf2.predict(X))
print(clf2.get_xgb_params())

This produces:

{'base_score': 0.5,
 'colsample_bylevel': 1,
 'colsample_bytree': 1,
 'gamma': 0,
 'learning_rate': 0.1,
 'max_delta_step': 1,
 'max_depth': 10,
 'min_child_weight': 1,
 'missing': nan,
 'n_estimators': 100,
 'objective': 'binary:logistic',
 'reg_alpha': 0.4,
 'reg_lambda': 0.8,
 'scale_pos_weight': 1,
 'seed': 0,
 'silent': 1,
 'subsample': 1}
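
If you keep the functional API and only have the saved booster on disk, one way to inspect the parameters stored inside it is Booster.save_config(). A minimal sketch, assuming XGBoost >= 1.0 (where save_config() is available) and using a placeholder file name:

import json
import xgboost as xgb

booster = xgb.Booster()
booster.load_model("xgb_saved.model")   # placeholder path to the saved booster

# save_config() returns the booster's internal configuration as a JSON string,
# including the training parameters the booster actually used
config = json.loads(booster.save_config())
print(json.dumps(config["learner"], indent=2))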
