Test set and train set for each fold in caret cross-validation

Question

I am trying to understand 5-fold cross-validation in the caret package, but I cannot find out how to get the training set and test set for each fold, and the similar suggested questions do not cover this either. Suppose I want to cross-validate a random forest model. I do the following:

library(caret)

set.seed(12)
train_control <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
rfmodel <- train(Species ~ ., data = iris, trControl = train_control, method = "rf")
first_holdout <- subset(rfmodel$pred, Resample == "Fold1")
str(first_holdout)
# 'data.frame':   90 obs. of  5 variables:
#  $ pred    : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
#  $ obs     : Factor w/ 3 levels "setosa","versicolor",..: 1 1 1 1 1 1 1 1 1 1
#  $ rowIndex: int  2 3 9 11 25 29 35 36 41 50 ...
#  $ mtry    : num  2 2 2 2 2 2 2 2 2 2 ...
#  $ Resample: chr  "Fold1" "Fold1" "Fold1" "Fold1" ...

Are these 90 observations in Fold1 used as the training set? If so, where is the test set for this fold?

Recommended answer

 str(rfmodel)

The fitted model object stores everything in the form shown below. Its control element holds, for each fold, the row indexes of the samples used for training in index and the row indexes of the corresponding hold-out samples in indexOut.

 names(rfmodel)
 #  [1] "method"       "modelInfo"    "modelType"    "results"      "pred"        
 #  [6] "bestTune"     "call"         "dots"         "metric"       "control"     
 # [11] "finalModel"   "preProcess"   "trainingData" "resample"     "resampledCM" 
 # [16] "perfNames"    "maximize"     "yLimits"      "times"        "levels"      
 # [21] "terms"        "coefnames"    "xlevels" 

Paths to the training and hold-out sample indexes:

 # Indexes of Hold Out Sets
 rfmodel$control$indexOut

 # Indexes of Train Sets for above hold outs
 rfmodel$control$index
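
The indexes above can be turned back into actual data frames. The following is a minimal sketch rather than part of the original answer: it refits the model from the question, assumes that the row numbers in index and indexOut refer to rows of iris in their original order, and uses list position instead of element names because the two lists may be labelled differently. The names train_idx, test_idx, train_fold1 and test_fold1 are illustrative only.

```r
library(caret)

# Refit the model from the question so that this example is self-contained
set.seed(12)
train_control <- trainControl(method = "cv", number = 5, savePredictions = TRUE)
rfmodel <- train(Species ~ ., data = iris, trControl = train_control, method = "rf")

# Row numbers for the first fold; list position is used instead of names,
# because index and indexOut may label their elements differently
train_idx <- rfmodel$control$index[[1]]
test_idx  <- rfmodel$control$indexOut[[1]]

# Rebuild the two data sets from the original data
train_fold1 <- iris[train_idx, ]
test_fold1  <- iris[test_idx, ]

# The saved predictions for Fold1 should refer only to hold-out rows
first_holdout <- subset(rfmodel$pred, Resample == "Fold1")
all(first_holdout$rowIndex %in% test_idx)

# Number of saved predictions per candidate mtry value
table(first_holdout$mtry)
```

If the check returns TRUE, the rows in rfmodel$pred are hold-out (test) predictions, not training rows. Since savePredictions = TRUE keeps the predictions for every candidate mtry value that was tried, each hold-out row can appear once per candidate, which would account for a fold showing more rows than its hold-out set contains.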
