Stratified k-fold Cross Validation in R


Problem description

Suppose I have a multiclass dataset (iris, for example). I want to perform stratified 10-fold CV to test model performance. I found a function called stratified in the package splitstackshape that gives me a stratified fold based on the proportion of the data I want. So a single testing fold would be 0.1 of the data rows.

# One fold
library(splitstackshape)
stratified(iris, c("Species"), 0.1)

I want to know how to use this function, or any other form of stratified CV, in a 10-fold loop. I couldn't crack the logic behind it. Here is a reproducible example.

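    library(splitstackshape)
    data = iris
    names(data)[ncol(data)] = c("Y")  # rename the label column (Species) to Y
    nFolds = 10

    for (i in 1:nFolds){
      # draw a stratified 10% sample of the rows as the testing fold
      testing = stratified(data, c("Y"), 0.1, keep.rownames=TRUE)
      rn = testing$rn
      testing = testing[,-"rn"]
      row.names(testing) = rn
      # every row not in the testing fold goes into the training set
      trainingRows = setdiff(1:nrow(data), as.numeric(row.names(testing)))
      training = data[trainingRows,]
      names(training)[ncol(training)] = "Y"
    }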

Recommended answer

It's late, but I hope this helps someone. This sample code worked for me:

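library(splitstackshape)
library(dplyr)  # provides %>% and filter()

# example data: 100 rows, each with a unique ID
dat1 <- data.frame(ID = 1:100,
              A = sample(c("AA", "BB", "CC", "DD", "EE"), 100, replace = TRUE),
              B = rnorm(100), C = abs(round(rnorm(100), digits=1)),
              D = sample(c("CA", "NY", "TX"), 100, replace = TRUE),
              E = sample(c("M", "F"), 100, replace = TRUE))

flds=list()  # flds[[i]] will hold the IDs of the rows in fold i
dat=dat1     # rows not yet assigned to any fold

for(i in 1:10){
  j=10-(i-1)  # number of folds still to be filled
  if(j>1){
    # take a stratified 1/j share of the remaining rows
    a=stratified(dat, c("E", "D"), size = 1/j)
    flds[[i]]=a$ID
    # drop the rows that were just assigned to fold i
    dat=dat%>%filter(ID %in% setdiff(dat$ID,a$ID))
  } else{
    # last fold: whatever is left
    flds[[i]]=dat$ID
  }
}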
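
If you would rather not depend on any package, the same idea can be sketched in base R. This is only a rough illustration, not the method used above: stratified_folds is a made-up helper name, and the nearest-centroid classifier is a stand-in for whatever model you actually want to evaluate. The helper shuffles the rows of each class and deals them out to the folds in turn, so every fold gets roughly the same class mix.

```r
# Hypothetical helper (not from any package): returns a fold id in 1..k
# for every element of y, stratified by the class labels in y.
stratified_folds <- function(y, k = 10) {
  fold <- integer(length(y))
  for (idx in split(seq_along(y), y)) {
    # shuffle the row indices of this class, then deal them out round-robin
    shuffled <- idx[sample.int(length(idx))]
    fold[shuffled] <- rep_len(seq_len(k), length(idx))
  }
  fold
}

set.seed(42)
k <- 10
fold <- stratified_folds(iris$Species, k)

accuracy <- numeric(k)
for (i in seq_len(k)) {
  training <- iris[fold != i, ]
  testing  <- iris[fold == i, ]

  # Stand-in model (an assumption, swap in your own): nearest class centroid
  # on the four numeric columns.
  centroids <- t(sapply(split(training[, 1:4], training$Species), colMeans))
  pred <- apply(as.matrix(testing[, 1:4]), 1, function(x) {
    rownames(centroids)[which.min(colSums((t(centroids) - x)^2))]
  })

  accuracy[i] <- mean(pred == testing$Species)
}

table(fold, iris$Species)  # class counts per fold
mean(accuracy)             # cross-validated accuracy of the stand-in model
```

With iris (50 rows per class) and k = 10, each fold ends up with 5 rows of each species. When a class size is not a multiple of k, the lower-numbered folds get the extra rows.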
