Advice to consider when training a robust cascade classifier?


Question

I'm training a cascade classifier in order to detect animals in images. Unfortunately my false positive rate is quite high (super high using Haar and LBP, acceptable using HOG). I'm wondering how I could possibly improve my classifier.

These are my questions:


  • How many training samples are needed for robust detection? I've read somewhere that 4000 positive and 800 negative samples are needed. Is that a good estimate?
  • How different should the training samples be? Is there a way to quantify image difference in order to include or exclude possible 'duplicate' data?
  • How should I deal with occluded objects? Should I train only on the part of the animal that is visible, or should I pick my ROI so that the average ROI stays fairly constant?
  • Regarding occluded objects: animals have legs, arms, tails, heads, etc. Since some body parts tend to be occluded quite often, does it make sense to select the 'torso' as the ROI?
  • Should I try to downscale my images and train on smaller image sizes? Could this possibly improve things?

I'm open to any pointers here!

Recommended answer


  • 4000 positives to 800 negatives is a bad ratio. You need to train your system on as many negative samples as possible, since AdaBoost (the core algorithm behind Haar-like feature selection) depends heavily on them. Using 4000 positives and 10000 negatives would be a good improvement.
  • Detecting "animals" in general is a hard problem. The detection task is already difficult, and you make it harder by widening the range of classes you want to cover. Start with cats: build a system that detects cats, then do the same for dogs. Have, say, 40 systems, each detecting a different animal, and combine them for your purpose later on.
  • For training, do not use occluded objects as positives. For example, if you want to detect frontal faces, train on frontal faces with only position and orientation changes, without any other object in front of them.
  • Downscaling is not important, as Haar cascade training itself scales every sample down to a small fixed window (24x24 in the original Viola-Jones setup). Watch the whole Viola-Jones presentation when you have enough time.
  • Good luck.
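
To make the sample-ratio and window-size advice concrete, here is a minimal sketch that assembles a command line for OpenCV's `opencv_traincascade` tool with one detector per animal. The file names (`cats.vec`, `negatives.txt`, `cascade_cats`) and the helper `traincascade_command` are hypothetical, the stage count of 20 is only an assumed starting point, and whether `HOG` is accepted as a feature type depends on the OpenCV version installed.

```python
import shlex


def traincascade_command(vec_file, bg_file, out_dir,
                         num_pos=4000, num_neg=10000,
                         feature_type="HAAR", width=24, height=24,
                         num_stages=20):
    """Build an argument list for OpenCV's opencv_traincascade tool.

    Defaults follow the answer above: more negatives than positives
    and a 24x24 training window. All paths are placeholders.
    """
    if feature_type not in ("HAAR", "LBP", "HOG"):
        raise ValueError("unknown feature type: " + feature_type)
    return [
        "opencv_traincascade",
        "-data", out_dir,            # directory that receives the trained cascade
        "-vec", vec_file,            # positives packed by opencv_createsamples
        "-bg", bg_file,              # text file listing negative images
        "-numPos", str(num_pos),
        "-numNeg", str(num_neg),
        "-numStages", str(num_stages),
        "-featureType", feature_type,
        "-w", str(width),            # must match the size used for the .vec file
        "-h", str(height),
    ]


# One command per animal class, as suggested above (cats first).
cmd = traincascade_command("cats.vec", "negatives.txt", "cascade_cats")
print(shlex.join(cmd))
```

The list can be passed to `subprocess.run(cmd)` on a machine where the tool is installed. In practice `-numPos` is often set somewhat below the number of samples in the `.vec` file, since later stages may need to draw extra positives.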