How to use custom cross validation folds with XGBoost
Question
I'm using the R wrapper for XGBoost. In the function xgb.cv, there is a folds parameter with the description:
list provides a possibility of using a list of pre-defined CV folds (each element must be a vector of fold's indices). If folds are supplied, the nfold and stratified parameters would be ignored.
So, do I specify only the indices used for training the model and assume the rest will be used for testing? For example, if my training data is something like
   Feature1 Feature2 Target
1:        2       10     10
2:        7        1      9
3:        8        2      3
4:        8       10      7
5:        8        2      9
6:        3        7      3
and I want to cross validate using (train, test) indices as ((1,2,3), (4,5,6)) and ((4,5,6), (1,2,3)), do I set folds=list(c(1,2,3), c(4,5,6))?
Recommended answer
Through some trial and error I figured out that xgboost uses the passed indices as the indices of the test folds. I confirmed this by noticing that the current devel version of xgboost explicitly states it in the documentation.
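To make the test-fold reading concrete, here is a minimal base-R sketch using the toy data from the question. It does not call xgboost. It only derives, for each round, which rows would be left for training if each element of folds is treated as the held-out test rows. The names train and train_rows are hypothetical and introduced only for illustration.

```r
# Toy data from the question: 6 rows.
train <- data.frame(
  Feature1 = c(2, 7, 8, 8, 8, 3),
  Feature2 = c(10, 1, 2, 10, 2, 7),
  Target   = c(10, 9, 3, 7, 9, 3)
)

# Each element of `folds` lists the rows HELD OUT for testing in that round.
folds <- list(c(1, 2, 3), c(4, 5, 6))

# Illustration only (not xgboost internals): the rows left for training
# in each round are the complement of that round's test rows.
train_rows <- lapply(folds, function(test_idx) {
  setdiff(seq_len(nrow(train)), test_idx)
})

# Print the (test, train) split implied by each fold.
for (k in seq_along(folds)) {
  cat("round", k, ": test =", folds[[k]], "| train =", train_rows[[k]], "\n")
}
```

Under this reading, folds=list(c(1,2,3), c(4,5,6)) tests on rows 1 to 3 while training on rows 4 to 6, and then the reverse. These are the same two splits the question asks for, just visited in the opposite order. The same list would then be passed to xgb.cv through its folds argument.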