Face Detection Algorithms with Minimal Training Time


Question

I wanted to ask whether there is any face detection scheme suitable for video that requires minimal training time, ideally a few days rather than the weeks that Viola-Jones takes. I have read about LBP, but it also requires a huge set of training samples, and I am not sure how long training takes. Does training an LBP detector take as much time as the Viola-Jones method with a similarly sized training set? I will be implementing this on a microprocessor such as a Raspberry Pi running Linux. I want to implement it in C for speed, because it needs to detect faces in a 10-20 fps video stream.

Answer

OpenCV ships with a tool called traincascade that trains LBP, Haar and HOG. Specifically for face detection they even ship the 3000-image dataset of 24x24 pixel faces, in the format needed by traincascade.

In my experience, of the three feature types traincascade supports, LBP takes the least time to train: on the order of hours, rather than the days Haar takes.

A quick overview of the training process: for the given number of stages (a decent choice is 20), it attempts to find, at each stage, features that reject as many non-faces as possible while not rejecting faces. The balance between rejecting non-faces and keeping faces is controlled by the per-stage minimum hit rate (OpenCV chose 99.5%) and false alarm rate (OpenCV chose 50%). The specific meta-algorithm used for crafting OpenCV's own LBP cascade is Gentle AdaBoost (GAB).
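To see why such a lax per-stage false alarm rate is acceptable, it helps to compound the per-stage rates over the whole cascade. The sketch below does that arithmetic under the simplifying assumption that stages behave independently; the function name `cascade_rates` is made up for illustration and is not part of OpenCV.

```python
def cascade_rates(num_stages, min_hit_rate, max_false_alarm_rate):
    """Compound per-stage rates over a cascade.

    Assumes stages are independent, which is only an approximation.
    Returns (overall hit rate, overall false alarm rate).
    """
    overall_hit = min_hit_rate ** num_stages
    overall_false_alarm = max_false_alarm_rate ** num_stages
    return overall_hit, overall_false_alarm


# Values quoted in the answer: 20 stages, 99.5% hit rate, 50% false alarm rate.
hit, false_alarm = cascade_rates(20, 0.995, 0.5)
print(f"overall hit rate        ~ {hit:.4f}")
print(f"overall false alarm rate ~ {false_alarm:.2e}")
```

Under that assumption, 20 stages keep roughly 90% of the faces while letting through about one non-face window in a million.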

The variant of LBP implemented in OpenCV is described here:

Shengcai Liao, Xiangxin Zhu, Zhen Lei, Lun Zhang and Stan Z. Li. Learning Multi-scale Block Local Binary Patterns for Face Recognition. International Conference on Biometrics (ICB), 2007, pp. 828-837.

What it amounts to in practice in OpenCV with default parameters is:

OpenCV LBP Cascade Runtime Overview

The detector examines 24x24 windows within the image, looking for a face. Stepping from Stage 1 to Stage 20 of the cascade classifier, if it can show that the current 24x24 window is likely not a face, it rejects the window and moves it one or two pixels over to the next position; otherwise it proceeds to the next stage.
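The window scan described above can be sketched as a simple generator. This is an illustration only, not OpenCV's code: `window_positions` and its parameters are hypothetical names, and the sketch covers a single scale with a fixed step.

```python
def window_positions(image_width, image_height, window=24, step=2):
    """Yield the top-left (x, y) of every window that fits inside the image."""
    for y in range(0, image_height - window + 1, step):
        for x in range(0, image_width - window + 1, step):
            yield x, y


# A tiny 32x28 image scanned with a 24x24 window and a 2-pixel step.
positions = list(window_positions(32, 28, window=24, step=2))
print(len(positions), positions[0], positions[-1])
```

Each yielded position is then handed to the cascade, which either rejects it early or passes it through all stages.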

During each stage, 3-10 or so LBP features are examined. Every LBP feature has an offset within the window and a size, and the area it covers is fully contained within the current window. Evaluating an LBP feature at a given position can result in either a pass or fail. Depending on whether an LBP feature succeeds or fails, a positive or negative weight particular to that feature is added to an accumulator.

Once all of a stage's LBP features are evaluated, the accumulator's value is compared to the stage threshold. A stage fails if the accumulator is below the threshold, and passes if it is above. Again, if a stage fails, the cascade is exited and the window moves to the next position.
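A minimal sketch of that stage logic follows. The `Feature` and `Stage` classes are hypothetical stand-ins rather than OpenCV types, the pass/fail test is an arbitrary callable, and treating an accumulator exactly equal to the threshold as a pass is an assumption of this sketch.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Feature:
    passes: Callable[[object], bool]  # hypothetical pass/fail test on a window
    pass_weight: float                # added to the accumulator on a pass
    fail_weight: float                # added to the accumulator on a fail


@dataclass
class Stage:
    features: List[Feature]
    threshold: float


def stage_passes(stage, window):
    """Sum the per-feature weights and compare against the stage threshold."""
    accumulator = 0.0
    for feature in stage.features:
        if feature.passes(window):
            accumulator += feature.pass_weight
        else:
            accumulator += feature.fail_weight
    return accumulator >= stage.threshold  # tie handling is an assumption


def window_is_face(stages, window):
    """Run the cascade; all() stops at the first failing stage."""
    return all(stage_passes(stage, window) for stage in stages)


# Toy cascade in which a "window" is just an integer.
stage1 = Stage([Feature(lambda w: w > 0, 1.0, -1.0),
                Feature(lambda w: w % 2 == 0, 0.5, -0.5)], threshold=0.0)
stage2 = Stage([Feature(lambda w: w > 3, 1.0, -1.0)], threshold=0.0)
print(window_is_face([stage1, stage2], 4))
```

The early exit is what makes cascades cheap: most windows are discarded after only a handful of feature evaluations.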

LBP feature evaluation is relatively simple. At that feature's offset within the window, nine rectangles are laid out in a 3x3 configuration. These nine rectangles are all the same size for a particular LBP feature, ranging from 1x1 to 8x8.

The sum of all the pixels in each of the nine rectangles is computed, in other words its integral. Then the central rectangle's integral is compared to that of each of its eight neighbours. The result of these eight comparisons is eight bits (1 or 0), which are assembled into an 8-bit LBP code.
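The computation can be sketched in pure Python with a summed-area table, so that each rectangle sum costs four lookups. Two details here are assumptions rather than something stated in the answer: the order in which neighbours are packed into bits (clockwise from the top-left) and the use of `>=` for the comparison. All function names are illustrative.

```python
def integral_image(img):
    """Return a (h+1) x (w+1) summed-area table for a 2D list of pixels."""
    h, w = len(img), len(img[0])
    ii = [[0] * (w + 1) for _ in range(h + 1)]
    for y in range(h):
        row_sum = 0
        for x in range(w):
            row_sum += img[y][x]
            ii[y + 1][x + 1] = ii[y][x + 1] + row_sum
    return ii


def rect_sum(ii, x, y, w, h):
    """Sum of the pixels in the w x h rectangle with top-left corner (x, y)."""
    return ii[y + h][x + w] - ii[y][x + w] - ii[y + h][x] + ii[y][x]


# Cell coordinates of the eight neighbours in the 3x3 grid.
# Clockwise-from-top-left ordering is an assumption of this sketch.
NEIGHBOURS = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2), (1, 2), (0, 2), (0, 1)]


def lbp_code(ii, x, y, cell_w, cell_h):
    """8-bit code for a 3x3 grid of cell_w x cell_h rectangles at (x, y)."""
    centre = rect_sum(ii, x + cell_w, y + cell_h, cell_w, cell_h)
    code = 0
    for cx, cy in NEIGHBOURS:
        s = rect_sum(ii, x + cx * cell_w, y + cy * cell_h, cell_w, cell_h)
        code = (code << 1) | (1 if s >= centre else 0)  # >= is an assumption
    return code


# Bright corners, dark edges, medium centre, with 1x1 cells.
img = [[9, 1, 9],
       [1, 5, 1],
       [9, 1, 9]]
ii = integral_image(img)
code = lbp_code(ii, 0, 0, 1, 1)
print(f"{code:08b}")
```

Because the rectangles are summed through the integral image, a feature built from 8x8 cells costs the same number of lookups as one built from 1x1 cells.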

This 8-bit bitvector is used as an index into a 2^8 == 256-bit LUT, computed by the training process and particular to each LBP feature, that determines whether the LBP feature passed or failed.
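One way to picture the lookup is as a 256-bit table stored in 32 bytes, with one bit per possible code. The packing below is an illustrative choice and may not match OpenCV's internal layout; `make_lut` and `lut_passes` are hypothetical names, and in a real cascade the table contents come from training rather than being written by hand.

```python
def make_lut(passing_codes):
    """Pack a set of passing 8-bit codes into a 256-bit lookup table."""
    lut = bytearray(32)  # 32 bytes * 8 bits = 256 bits
    for code in passing_codes:
        lut[code >> 3] |= 1 << (code & 7)
    return bytes(lut)


def lut_passes(lut, code):
    """Return True if the bit for this 8-bit code is set in the table."""
    return bool((lut[code >> 3] >> (code & 7)) & 1)


# Hand-written table for illustration; training would normally produce it.
lut = make_lut({0, 170, 255})
print(lut_passes(lut, 170), lut_passes(lut, 171))
```

A single bit test per feature keeps the runtime cost tiny, which matters on hardware such as the Raspberry Pi.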

That is all there is to it.
