Difference in value between xgb.train and xgb.XGBRegressor in Python for certain cases


Question

I noticed that there are two possible implementations of XGBoost in Python, the native API (xgb.train) and the scikit-learn wrapper (xgb.XGBRegressor), as discussed here and here.

When I ran the same dataset through both implementations, I noticed that the predictions were different.

Code

import xgboost
from xgboost.sklearn import XGBRegressor
import pandas as pd
import numpy as np
from sklearn import datasets

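# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2,
# so this line needs an older scikit-learn release.
boston_data = datasets.load_boston()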
df = pd.DataFrame(boston_data.data,columns=boston_data.feature_names)
df['target'] = pd.Series(boston_data.target)

Y = df["target"]
X = df.drop('target', axis=1)

# Native XGBoost API
# missing=0.0 tells XGBoost to treat every 0.0 feature value as missing.
dtrain = xgboost.DMatrix(X, label=Y, missing=0.0)
params = {'max_depth': 3, 'learning_rate': 0.05, 'min_child_weight': 4, 'subsample': 0.8}
evallist = [(dtrain, 'eval'), (dtrain, 'train')]  # defined but not passed to train()

model = xgboost.train(params=params, dtrain=dtrain, num_boost_round=200)

predictions = model.predict(dtrain)

# scikit-learn wrapper for XGBoost (no "missing" argument is passed here)
model = XGBRegressor(n_estimators=200, max_depth=3, learning_rate=0.05, min_child_weight=4, subsample=0.8)

# model = model.fit(X, Y, eval_set=[(X, Y), (X, Y)], eval_metric='rmse', verbose=True)
model = model.fit(X, Y)

predictions2 = model.predict(X)

print(np.absolute(predictions-predictions2).sum())

Sum of absolute differences using the sklearn Boston dataset:

62.687134

When I ran the same code on other datasets, such as the sklearn diabetes dataset, the difference was much smaller.

Sum of absolute differences using the sklearn diabetes dataset:

0.0011711121

Recommended answer

I had not set the "missing" parameter for the sklearn implementation, while the native code passed missing=0.0 to DMatrix. Once the parameter was set in the wrapper as well, the values matched.

Also, as Noah pointed out, the sklearn wrapper has a few different default values, which need to be matched in order to reproduce the results exactly.
