MATLAB: 10-fold cross-validation without using existing functions

Question

I have a matrix (I guess in MATLAB you call it a struct) or data structure:

  data: [150x4 double]
labels: [150x1 double]

Here is what matrix.data looks like, assuming I load my file into a variable named matrix:

5.1000    3.5000    1.4000    0.2000
4.9000    3.0000    1.4000    0.2000
4.7000    3.2000    1.3000    0.2000
4.6000    3.1000    1.5000    0.2000
5.0000    3.6000    1.4000    0.2000
5.4000    3.9000    1.7000    0.4000
4.6000    3.4000    1.4000    0.3000
5.0000    3.4000    1.5000    0.2000
4.4000    2.9000    1.4000    0.2000
4.9000    3.1000    1.5000    0.1000
5.4000    3.7000    1.5000    0.2000
4.8000    3.4000    1.6000    0.2000
4.8000    3.0000    1.4000    0.1000
4.3000    3.0000    1.1000    0.1000
5.8000    4.0000    1.2000    0.2000
5.7000    4.4000    1.5000    0.4000
5.4000    3.9000    1.3000    0.4000
5.1000    3.5000    1.4000    0.3000
5.7000    3.8000    1.7000    0.3000
5.1000    3.8000    1.5000    0.3000

And here is what matrix.labels looks like:

 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1
 1

I am trying to implement 10-fold cross-validation without using any of the existing functions in MATLAB, and because my MATLAB knowledge is very limited I am having trouble moving forward from what I have. Any help would be great.

This is what I have so far. I am sure it is probably not the MATLAB way, but I am very new to MATLAB.

function[output] = fisher(dataFile, number_of_folds)
    data = load(dataFile);
    %create random permutation indx
    idx = randperm(150);
    output = data.data(idx(1:15),:);
end

Recommended answer

Here is my take on this cross-validation. I create dummy data using magic(10) and generate the labels randomly. The idea is the following: we take our data and labels and combine them with a column of random numbers. Consider the following dummy code, which uses magic(4) to keep the output small.

>> data = magic(4)

data =

    16     2     3    13
     5    11    10     8
     9     7     6    12
     4    14    15     1

>> dataRowNumber = size(data,1)

dataRowNumber =

     4

>> randomColumn = rand(dataRowNumber,1)

randomColumn =

    0.8147
    0.9058
    0.1270
    0.9134


>> X = [ randomColumn data]

X =

    0.8147   16.0000    2.0000    3.0000   13.0000
    0.9058    5.0000   11.0000   10.0000    8.0000
    0.1270    9.0000    7.0000    6.0000   12.0000
    0.9134    4.0000   14.0000   15.0000    1.0000

If we sort the rows of X by column 1 (keeping each row intact, for example with sortrows), the data ends up in random order. This gives us the randomness that cross-validation needs. The next step is to divide X according to the cross-validation percentages. Doing this for a single split is easy enough. Let's say 75% of the rows are the training set and 25% are the test set. Our size here is 4, so 3/4 = 75% and 1/4 = 25%.

testDataset = X(1,:)
trainDataset = X(2:4,:)
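
As a minimal sketch of the sort-then-split step described above: the rows are reordered by the random key with sortrows, because sort(X,1) would sort every column on its own and break the rows apart. The name shuffled and the step that drops the random column afterwards are my own additions for illustration, not part of the original answer.

```matlab
data = magic(4);
dataRowNumber = size(data, 1);
randomColumn = rand(dataRowNumber, 1);
X = [randomColumn data];

% Reorder whole rows by the random key stored in column 1.
shuffled = sortrows(X, 1);

% Drop the random key once the rows are shuffled (illustrative addition).
shuffled = shuffled(:, 2:end);

% 25% test (1 row) and 75% train (3 rows).
testDataset  = shuffled(1, :);
trainDataset = shuffled(2:4, :);
```

Every original row of data ends up in exactly one of the two sets, only the order changes.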

Doing this for N folds is a bit harder, since the split has to be made N times. A for loop is necessary for this. For 5 folds over 10 rows, I get:

  1. First fold: test 1 2, train 3:10
  2. Second fold: test 3 4, train 1 2 5:10
  3. Third fold: test 5 6, train 1:4 7:10
  4. Fourth fold: test 7 8, train 1:6 9:10
  5. Fifth fold: test 9 10, train 1:8
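
The fold layout listed above can be sketched as a loop over row indices only. This is a rough illustration that assumes the number of rows divides evenly by the number of folds; the variable names numberOfRows, rowsPerFold and firstTestRow are hypothetical and not taken from the answer.

```matlab
numberOfRows = 10;
numberOfFolds = 5;
rowsPerFold = numberOfRows / numberOfFolds;   % assumes an exact division

for fold = 1:numberOfFolds
    firstTestRow = (fold - 1) * rowsPerFold + 1;
    testRows = firstTestRow : firstTestRow + rowsPerFold - 1;
    % Every row that is not a test row is a training row.
    trainRows = [1:firstTestRow-1, testRows(end)+1:numberOfRows];
    fprintf('Fold %d: test %s, train %s\n', fold, ...
            mat2str(testRows), mat2str(trainRows));
end
```

For the first fold, 1:firstTestRow-1 is the empty range 1:0, so no special case is needed there.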

The following code is an example of this process:

data = magic(10);
dataRowNumber = size(data,1);
labels = rand(dataRowNumber,1) > 0.5;     % random dummy labels
randomColumn = rand(dataRowNumber,1);     % random key used for shuffling

X = [randomColumn data labels];

% sortrows keeps each row intact; sort(X,1) would sort every column separately
SortedData = sortrows(X,1);

crossValidationFolds = 5;
numberOfRowsPerFold = dataRowNumber / crossValidationFolds;

crossValidationTrainData = [];
crossValidationTestData = [];
for startOfRow = 1:numberOfRowsPerFold:dataRowNumber
    testRows = startOfRow:startOfRow+numberOfRowsPerFold-1;
    if (startOfRow == 1)
        trainRows = max(testRows)+1:dataRowNumber;
    else
        trainRows = [1:startOfRow-1 max(testRows)+1:dataRowNumber];
    end
    % the train and test rows of every fold are stacked below each other
    crossValidationTrainData = [crossValidationTrainData; SortedData(trainRows,:)];
    crossValidationTestData = [crossValidationTestData; SortedData(testRows,:)];
end
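
To connect this back to the question, here is a rough sketch of the same idea for 10 folds over a 150x4 data matrix and 150x1 label vector. It is one possible variant, not the answer's method: it shuffles with randperm (which the question already uses) instead of a random column, and it keeps each fold in its own cell rather than stacking all folds together. The stand-in data, the random labels, and the names foldOfRow, trainData, testData, trainLabels and testLabels are assumptions for illustration.

```matlab
% Stand-in for the 150x4 data and 150x1 labels from the question
% (hypothetical values; replace with the fields of the loaded struct).
data = rand(150, 4);
labels = randi(3, 150, 1);

numberOfFolds = 10;
numberOfRows = size(data, 1);

% Shuffle the row indices, then hand out fold numbers 1..10 in turn.
idx = randperm(numberOfRows);
foldOfRow = zeros(numberOfRows, 1);
foldOfRow(idx) = mod(0:numberOfRows-1, numberOfFolds) + 1;

trainData = cell(numberOfFolds, 1);  trainLabels = cell(numberOfFolds, 1);
testData  = cell(numberOfFolds, 1);  testLabels  = cell(numberOfFolds, 1);
for fold = 1:numberOfFolds
    isTest = (foldOfRow == fold);       % rows held out in this fold
    testData{fold}    = data(isTest, :);
    testLabels{fold}  = labels(isTest);
    trainData{fold}   = data(~isTest, :);
    trainLabels{fold} = labels(~isTest);
end
```

With 150 rows and 10 folds, each fold holds out 15 rows and trains on the remaining 135. A classifier would then be fit on trainData{fold} and scored on testData{fold} inside the same loop.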
