10-fold cross validation

This article explains how 10-fold cross-validation works; it may serve as a useful reference for readers facing the same question.

Problem Description

In k-fold cross-validation we have this: you divide the data into k subsets of (approximately) equal size. You train the net k times, each time leaving out one of the subsets from training, but using only the omitted subset to compute whatever error criterion interests you. If k equals the sample size, this is called "leave-one-out" cross-validation. "Leave-v-out" is a more elaborate and expensive version of cross-validation that involves leaving out all possible subsets of v cases.

What do the terms "training" and "testing" mean? I can't understand them.

Could you please point me to some references where I can learn this algorithm through an example?

Train classifier on folds: 2 3 4 5 6 7 8 9 10; Test against fold: 1
Train classifier on folds: 1 3 4 5 6 7 8 9 10; Test against fold: 2
Train classifier on folds: 1 2 4 5 6 7 8 9 10; Test against fold: 3
Train classifier on folds: 1 2 3 5 6 7 8 9 10; Test against fold: 4
Train classifier on folds: 1 2 3 4 6 7 8 9 10; Test against fold: 5
Train classifier on folds: 1 2 3 4 5 7 8 9 10; Test against fold: 6
Train classifier on folds: 1 2 3 4 5 6 8 9 10; Test against fold: 7
Train classifier on folds: 1 2 3 4 5 6 7 9 10; Test against fold: 8
Train classifier on folds: 1 2 3 4 5 6 7 8 10; Test against fold: 9
Train classifier on folds: 1 2 3 4 5 6 7 8 9;  Test against fold: 10
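As a rough illustration of that splitting scheme (not part of the original post), here is a minimal Python sketch; the helper name k_fold_indices and the sample count of 100 are made up for the example:

# Minimal sketch of k-fold splitting in plain Python.
# Only the indices matter here; any leftover samples when n_samples is not
# divisible by k are simply dropped in this simplified version.
def k_fold_indices(n_samples, k=10):
    """Yield (train_indices, test_indices) pairs, one pair per fold."""
    fold_size = n_samples // k
    indices = list(range(n_samples))
    for fold in range(k):
        test = indices[fold * fold_size:(fold + 1) * fold_size]
        train = indices[:fold * fold_size] + indices[(fold + 1) * fold_size:]
        yield train, test

# Example: 100 samples, 10 folds -> each run trains on 90 samples and
# tests on the 10 samples that were held out, exactly as in the list above.
for fold_number, (train_idx, test_idx) in enumerate(k_fold_indices(100, k=10), start=1):
    print(f"Fold {fold_number}: train on {len(train_idx)} samples, test on {len(test_idx)}")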
 

Solution



In short: Training is the process of providing feedback to the algorithm in order to adjust the predictive power of the classifier(s) it produces.

Testing is the process of determining the realistic accuracy of the classifier(s) which were produced by the algorithm. During testing, the classifier(s) are given never-before-seen instances of data to do a final confirmation that the classifier's accuracy is not drastically different from that during training.

However, you're missing a key step in the middle: the validation (which is what you're referring to in the 10-fold/k-fold cross validation).

Validation is (usually) performed after each training step and it is performed in order to help determine if the classifier is being overfitted. The validation step does not provide any feedback to the algorithm in order to adjust the classifier, but it helps determine if overfitting is occurring and it signals when the training should be terminated.

Think about the process in the following manner:

1. Train on the training data set.
2. Validate on the validation data set.
if(change in validation accuracy > 0)
   3. repeat step 1 and 2
else
   3. stop training
4. Test on the testing data set.
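A hedged Python sketch of that loop follows; the callables train_step and accuracy, and the function name fit_with_early_stopping, are hypothetical stand-ins for whatever classifier and evaluation routine you are actually using, not something from the original answer:

# Train until validation accuracy stops improving, then report test accuracy.
def fit_with_early_stopping(model, train_set, val_set, test_set,
                            train_step, accuracy, max_epochs=100):
    best_val_acc = float("-inf")
    for _ in range(max_epochs):
        train_step(model, train_set)           # 1. train on the training data set
        val_acc = accuracy(model, val_set)     # 2. validate on the validation data set
        if val_acc <= best_val_acc:            # change in validation accuracy <= 0
            break                              # 3. stop training
        best_val_acc = val_acc                 # otherwise repeat steps 1 and 2
    return accuracy(model, test_set)           # 4. test on the testing data set

Note that the validation set only decides when to stop; the test set is touched once, at the very end, so its accuracy remains an honest estimate.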

This concludes this article on 10-fold cross-validation; we hope the answer above is helpful.
