The label type must be float if you want to read the XML file of a random forest (OpenCV 3.0)

Problem description

#include <opencv2/core.hpp>
#include <opencv2/ml.hpp>

#include <iostream>
#include <vector>

int main()
{
    size_t const FeatureSize = 24;
    {
        auto rtrees = cv::ml::RTrees::create();
        rtrees->setMaxDepth(10);
        rtrees->setMinSampleCount(2);
        rtrees->setRegressionAccuracy(0);
        rtrees->setUseSurrogates(false);
        rtrees->setMaxCategories(16);
        rtrees->setPriors(cv::Mat());
        rtrees->setCalculateVarImportance(false);
        rtrees->setActiveVarCount(0);
        rtrees->setTermCriteria({ cv::TermCriteria::MAX_ITER, 100, 0 });

        std::vector<float> labels; //#1
        cv::Mat_<float> features;        
        for(size_t i = 0; i != 500; ++i){
            std::vector<float> data;
            for(size_t j = 0; j != FeatureSize; ++j){
                data.emplace_back(0); //#2
            }
            labels.emplace_back(i % 2);
            features.push_back(cv::Mat(data, true));
        }                

        rtrees->train(features.reshape(1, labels.size()),
                      cv::ml::ROW_SAMPLE, labels);
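        // write() takes a non-const FileStorage&, so use a named object
        // instead of a temporary (binding a temporary is not standard C++).
        cv::FileStorage fs("smoke_classifier.xml", cv::FileStorage::WRITE);
        rtrees->write(fs);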
    }

    {
        auto rtrees2 = cv::ml::RTrees::create();

        cv::FileStorage read("smoke_classifier.xml",
                             cv::FileStorage::READ);
        rtrees2->read(read.root());

        int a = rtrees2->getMinSampleCount();
        std::cout<<"a == "<<a<<"\n";
        cv::Mat1f feat2(1, FeatureSize, 0.f);
        std::cout<<"predict == "<<rtrees2->predict(feat2)<<"\n";
    }  
} 



If #1 is changed from float to int, the program crashes when it reads the XML file and then calls predict. If the model is not read back from the XML file, predict works even when the type at #1 is int.

However, if I change the labels from int to float, RTrees reports a different error when I call train. The dummy data "0" in the code snippet (#2) does not trigger it, but real data does.

The other problem is that changing the labels from int to float turns the task from classification into regression, and what I really need is classification (although, with only two labels, it is easy to mimic classification with regression).
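As a minimal sketch of what "mimicking classification with regression" could look like for two labels, the regression output can be thresholded. The helper name `labelFromScore` and the 0.5 cut-off are assumptions made for illustration; they are not part of OpenCV.

```cpp
// Hypothetical helper (not an OpenCV function): map a regression score
// to one of the two class labels 0 and 1 by thresholding.
int labelFromScore(float score, float threshold = 0.5f)
{
    // Scores at or above the threshold are treated as class 1.
    return score >= threshold ? 1 : 0;
}
```

In the question's setup this would be applied to the value returned by `predict` when the forest was trained with float labels.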

The error message when I change the labels to float and call train:

"....\opencv-3.0.0\sources\modules\ml\src\tree.cpp:1190: error: (-215) (int)_sleft.size() < n && (int)_sright.size() < n in function cv::ml::DTreesImpl::calcDir"

Recommended answer

The relevant code is in tree.cpp.

With int labels, this line causes the crash:

float DTreesImpl::predictTrees( const Range& range, const Mat& sample, int flags ) const
{
    ...
    if( predictType == PREDICT_MAX_VOTE ) {
    ...
        sum = (flags & RAW_OUTPUT) ? (float)best_idx : classLabels[best_idx]; // Line 1487
    ...
    }
}

The crash happens because classLabels is empty at that point (even though it is present in the XML file). With float labels, this line is not executed, since predictType is PREDICT_SUM instead of PREDICT_MAX_VOTE. (The relevant code is in the same function.)
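To illustrate why an empty label table is fatal in the max-vote branch, here is a generic sketch that does not use OpenCV. The function `predictMaxVote` and its parameters are hypothetical stand-ins; only the `classLabels[best_idx]` lookup is taken from the snippet above. It assumes the label table ends up empty after loading.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for a max-vote prediction: pick the class index
// with the most votes, then map it to a label through classLabels.
int predictMaxVote(const std::vector<int>& votesPerClass,
                   const std::vector<int>& classLabels)
{
    if (votesPerClass.empty())
        throw std::invalid_argument("no votes");

    std::size_t bestIdx = 0;
    for (std::size_t i = 1; i < votesPerClass.size(); ++i)
        if (votesPerClass[i] > votesPerClass[bestIdx])
            bestIdx = i;

    // at() throws std::out_of_range if classLabels is empty or too short.
    // An unchecked classLabels[bestIdx] would be undefined behavior instead.
    return classLabels.at(bestIdx);
}
```

With a populated label table the lookup succeeds; with an empty one the checked access fails loudly, whereas an unchecked index would read out of bounds.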

The cause is that the file is not loaded correctly (this may be a bug). When the file is read, there is this check:

void DTreesImpl::readParams( const FileNode& fn )
{
    ...
    int format = 0; // line 1720
    fn["format"] >> format;
    bool isLegacy = format < 3;
    ...
    if (isLegacy) { ... }
    else 
    {
        ...
        fn["class_labels"] >> classLabels;            
    }
}

But when the file is written, the field "format" is not there. So the file is in fact read in the wrong format, because the isLegacy branch is taken.
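The decision described above can be mimicked without OpenCV. In this sketch a `std::map` plays the role of the file node, and `isLegacyModel` is a hypothetical name; only the `format < 3` test and the default of 0 come from the readParams snippet.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for the check in readParams: the map represents
// the integer fields stored in the model file.
bool isLegacyModel(const std::map<std::string, int>& fields)
{
    int format = 0;  // stays 0 when the "format" field is missing
    const auto it = fields.find("format");
    if (it != fields.end())
        format = it->second;
    return format < 3;  // a missing field is therefore treated as legacy
}
```

A file without a "format" entry is classified as legacy, while one that stores the value 3 is not, which is why adding that field changes how the file is read.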

A workaround is to save the file like this:

...
std::vector<int> labels;
...
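{
    // write() takes a non-const FileStorage&, so use a named object.
    cv::FileStorage out("smoke_classifier.xml", cv::FileStorage::WRITE);
    rtrees->write(out);
} // the file is closed here, before it is reopened for appending
// Add this
{
    cv::FileStorage fs("smoke_classifier.xml", cv::FileStorage::APPEND);
    fs << "format" << 3; // so that isLegacy evaluates to false
}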

cv::FileStorage read("smoke_classifier.xml",
                     cv::FileStorage::READ);
auto rtrees2 = cv::ml::RTrees::create();
rtrees2->read(read.root());

With this change, the file is loaded correctly and the program does not crash.

Since I am not able to reproduce your other problem in calcDir, let me know if this works.
