OpenCV Neural Network Sigmoid Output


Problem Description


I have been using OpenCV for quite some time. I recently decided to explore its machine learning capabilities, so I ended up implementing a neural network for face recognition. To summarize my face recognition strategy:

  1. Read images from a csv of some face database.
  2. Roll images to a Mat array row wise.
  3. Apply PCA for dimensionality reduction.
  4. Use projections of PCA to train the network.
  5. Predict the test data using the trained network.

    So everything was OK until the prediction stage. I was using the maximum-response output unit to classify the face. Normally, OpenCV's sigmoid implementation should give values in the range -1 to 1, as stated in the docs, with 1 being the closest match to a class. After getting nearly 0 accuracy, I checked the output responses of every class for each test sample. I was surprised by the values: 14.53, -1.7, #IND. If the sigmoid was applied, how could I get these values? Where am I going wrong?

    To help you understand the matter, and for those wondering how to apply PCA and use it with a NN, I'm sharing my code:

Reading csv:

void read_csv(const string& filename, vector<Mat>& images, vector<int>& labels, char separator = ';') 
{
    std::ifstream file(filename.c_str(), ifstream::in);
    if (!file) 
    {
        string error_message = "No valid input file was given, please check the given filename.";
        CV_Error(1, error_message);
    }
    string line, path, classlabel;
    while (getline(file, line)) 
    {
        stringstream liness(line);

        getline(liness, path, separator);
        getline(liness, classlabel);

        if(!path.empty() && !classlabel.empty()) 
        {
            Mat im = imread(path, 0);

            images.push_back(im);
            labels.push_back(atoi(classlabel.c_str()));
        }
    }
}

Rolling images row by row :

Mat rollVectortoMat(const vector<Mat> &data)
{
   Mat dst(static_cast<int>(data.size()), data[0].rows*data[0].cols, CV_32FC1);
   for(unsigned int i = 0; i < data.size(); i++)
   {
      Mat image_row = data[i].clone().reshape(1,1);
      Mat row_i = dst.row(i);                                       
      image_row.convertTo(row_i,CV_32FC1, 1/255.);
   }
   return dst;
}

Converting vector of labels to Mat of labels

Mat getLabels(const vector<int> &data,int classes = 20)
{
    Mat labels = Mat::zeros(static_cast<int>(data.size()), classes, CV_32FC1); // zero-init so only the target column becomes 1

    for(int i = 0; i < static_cast<int>(data.size()) ; i++)
    {
        int cls = data[i] - 1;  
        labels.at<float>(i,cls) = 1.0;  
    }

    return labels;
}

MAIN

int main()
{

    PCA pca;

    vector<Mat> images_train;
    vector<Mat> images_test;
    vector<int> labels_train;
    vector<int> labels_test;

    read_csv("train1k.txt",images_train,labels_train);
    read_csv("test1k.txt",images_test,labels_test);

    Mat rawTrainData = rollVectortoMat(images_train);                       
    Mat rawTestData  = rollVectortoMat(images_test);                

    Mat trainLabels = getLabels(labels_train);
    Mat testLabels  = getLabels(labels_test);

    int pca_size = 500;

    Mat trainData(rawTrainData.rows, pca_size,rawTrainData.type());
    Mat testData(rawTestData.rows,pca_size,rawTestData.type());


    pca(rawTrainData,Mat(),CV_PCA_DATA_AS_ROW,pca_size);

    for(int i = 0; i < rawTrainData.rows ; i++)
        pca.project(rawTrainData.row(i),trainData.row(i));

    for(int i = 0; i < rawTestData.rows ; i++)
        pca.project(rawTestData.row(i),testData.row(i));



    Mat layers = Mat(3,1,CV_32SC1);
    int sz = trainData.cols ;

    layers.row(0) = Scalar(sz);
    layers.row(1) = Scalar(1000);
    layers.row(2) = Scalar(20);

    CvANN_MLP mlp;
    CvANN_MLP_TrainParams params;
    CvTermCriteria criteria;

    criteria.max_iter = 1000;
    criteria.epsilon  = 0.00001f;
    criteria.type     = CV_TERMCRIT_ITER | CV_TERMCRIT_EPS;

    params.train_method    = CvANN_MLP_TrainParams::BACKPROP;
    params.bp_dw_scale     = 0.1f;
    params.bp_moment_scale = 0.1f;
    params.term_crit       = criteria;

    mlp.create(layers,CvANN_MLP::SIGMOID_SYM);
    int i = mlp.train(trainData,trainLabels,Mat(),Mat(),params);

    int t = 0, f = 0;

    for(int i = 0; i < testData.rows ; i++)
    {
        Mat response(1,20,CV_32FC1);
        Mat sample = testData.row(i);

        mlp.predict(sample,response);

        float max = -1000000000000.0f;
        int cls = -1;

        for(int j = 0 ; j < 20 ; j++)   
        {
            float value = response.at<float>(0,j);

            if(value > max)
            {
                max = value;
                cls = j + 1;
            }
        }

        if(cls == labels_test[i])
            t++;
        else
            f++;
    }


    return 0;
}

NOTE: I used the first 20 classes of the AT&T face database for my dataset.

Solution

Thanks to Canberk Baci's comment, I managed to overcome the sigmoid output discrepancy. The problem seems to be in the default parameters of mlp's create function, which takes alpha and beta as 0 by default. When both are given as 1, the sigmoid function works as stated in the docs and the neural network can predict something, though with errors of course.

As for the neural network's results:

By modifying some parameters like momentum, and without any illumination correction algorithm, I got 72% accuracy on the first 20 classes of the CroppedYaleB dataset from the OpenCV tutorials (randomly sampled: 936 train images, 262 test images). As for other factors that could increase accuracy: when I applied PCA, I directly set the reduced dimension size to 500. This may also reduce accuracy, because the retained variance may be below 95% or worse. So when I have free time I will apply these to increase accuracy:

  1. Tan-Triggs illumination correction
  2. Train PCA with 0.95 as the pca size, to retain 95% of the variance.
  3. Modify the neural network parameters (I wish we had a less parametric NN in the OpenCV library)

I shared these in case someone is wondering how to increase the classification accuracy of a NN. I hope it helps.

By the way you can track the issue about this here: http://code.opencv.org/issues/3583
