xgboost, offset exposure?
Question
I am modelling claims frequency (Poisson distribution) in R. I am using the gbm and xgboost packages, but xgboost does not seem to have an offset parameter for taking the exposure into account.
In gbm, one would take the exposure into account as follows:
gbm.fit(x = train, y = target, n.trees = 100, distribution = "poisson", offset = log(exposure))
How do I achieve the same with xgboost?
PS: I cannot use the exposure as a predictor, since a new observation is created each time a claim is observed.
Recommended answer
Once you have created your xgboost matrix (an xgb.DMatrix), you can set an offset using setinfo and the base_margin attribute, e.g.:
setinfo(xgtrain, "base_margin", log(d$exposure))
You can see the full example in the similar question I asked here: XGBoost - Poisson distribution with varying exposure / offset
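For reference, here is a minimal end-to-end sketch of the approach, assuming the classic xgb.DMatrix / xgb.train interface of the R package (argument names may differ slightly between xgboost versions). The data are simulated and the names x, exposure, claims, and dnew are made up for illustration. The idea is that with the count:poisson objective the model works on the log scale, so setting base_margin to log(exposure) plays the same role as offset = log(exposure) in gbm: the label is the raw claim count and the trees learn the log rate per unit of exposure.

```r
library(xgboost)

set.seed(42)

# Simulated data (hypothetical): one feature and a varying exposure per row
n        <- 1000
x        <- matrix(runif(n), ncol = 1, dimnames = list(NULL, "x1"))
exposure <- runif(n, min = 0.1, max = 1)        # e.g. policy-years
rate     <- exp(-1 + 0.8 * x[, 1])              # true claims per unit of exposure
claims   <- rpois(n, lambda = rate * exposure)  # observed claim counts

# The label is the raw claim COUNT, not claims / exposure
dtrain <- xgb.DMatrix(data = x, label = claims)

# log(exposure) acts as the offset on the log (margin) scale
invisible(setinfo(dtrain, "base_margin", log(exposure)))

bst <- xgb.train(
  params  = list(objective = "count:poisson", eta = 0.1, max_depth = 2),
  data    = dtrain,
  nrounds = 50
)

# Data used for prediction needs its own base_margin as well
dnew <- xgb.DMatrix(data = x)
invisible(setinfo(dnew, "base_margin", log(exposure)))

pred_counts <- predict(bst, dnew)       # expected claim counts, exposure included
pred_rate   <- pred_counts / exposure   # expected claims per unit of exposure
```

Note that the offset has to be set on every DMatrix passed to predict; without it the predictions are not scaled by that row's exposure.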