Viola-Jones' face detection claims 180k features

Question

I've been implementing an adaptation of Viola-Jones' face detection algorithm. The technique relies upon placing a subframe of 24x24 pixels within an image, and subsequently placing rectangular features inside it in every position with every size possible.

These features can consist of two, three or four rectangles. The following example is presented.

They claim the exhaustive set is more than 180k (section 2):

Given that the base resolution of the detector is 24x24, the exhaustive set of rectangle features is quite large, over 180,000. Note that unlike the Haar basis, the set of rectangle features is overcomplete.

The following statements are not explicitly stated in the paper, so they are assumptions on my part:

  1. There are only 2 two-rectangle features, 2 three-rectangle features and 1 four-rectangle feature. The logic behind this is that we are observing the difference between the highlighted rectangles, not explicitly the color or luminance or anything of that sort.
  2. We cannot define feature type A as a 1x1 pixel block; it must be at least 1x2 pixels. Also, type D must be at least 2x2 pixels, and this rule holds accordingly for the other features.
  3. We cannot define feature type A as a 1x3 pixel block, as the middle pixel cannot be partitioned, and subtracting it from itself is identical to a 1x2 pixel block; this feature type is only defined for even widths. Also, the width of feature type C must be divisible by 3, and this rule holds accordingly for the other features.
  4. We cannot define a feature with a width and/or height of 0. Therefore, we iterate x and y to 24 minus the size of the feature.

Based upon these assumptions, I've counted the exhaustive set:

const int frameSize = 24;
const int features = 5;
// All five feature types:
const int feature[features][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};

int count = 0;
// Each feature:
for (int i = 0; i < features; i++) {
    int sizeX = feature[i][0];
    int sizeY = feature[i][1];
    // Each position:
    for (int x = 0; x <= frameSize-sizeX; x++) {
        for (int y = 0; y <= frameSize-sizeY; y++) {
            // Each size fitting within the frameSize:
            for (int width = sizeX; width <= frameSize-x; width+=sizeX) {
                for (int height = sizeY; height <= frameSize-y; height+=sizeY) {
                    count++;
                }
            }
        }
    }
}

The result is 162,336.

The only way I found to approximate the "over 180,000" Viola & Jones speak of is to drop assumption #4 and introduce bugs into the code. This involves changing the two innermost loops respectively to:

for (int width = 0; width < frameSize-x; width+=sizeX)
for (int height = 0; height < frameSize-y; height+=sizeY)

The result is then 180,625. (Note that this will effectively prevent the features from ever touching the right and/or bottom of the subframe.)
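
To make the effect of the altered loops concrete, here is a minimal sketch in C that runs both loop variants and also tallies how many counted entries have a width or height of zero. The function name `count_features` and its `relaxed` and `degenerate` parameters are illustrative assumptions, not part of the original listing.

```c
#include <stddef.h>

/* The five base shapes from the listing above. */
static const int SHAPES[5][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};

/*
 * Count features in a frame x frame window (hypothetical helper).
 * relaxed == 0: sizes start at the base shape and may reach the frame edge.
 * relaxed != 0: sizes start at 0 and must stay strictly inside the edge,
 *               mirroring the two altered loop headers.
 * If degenerate is non-NULL, it receives the number of counted entries
 * whose width or height is zero.
 */
static long count_features(int frame, int relaxed, long *degenerate)
{
    long count = 0;
    long zero = 0;

    for (int i = 0; i < 5; i++) {
        int sx = SHAPES[i][0];
        int sy = SHAPES[i][1];

        for (int x = 0; x <= frame - sx; x++) {
            for (int y = 0; y <= frame - sy; y++) {
                /* "w < frame - x" is the same as "w <= frame - x - 1". */
                int wmax = frame - x - (relaxed ? 1 : 0);
                int hmax = frame - y - (relaxed ? 1 : 0);

                for (int w = relaxed ? 0 : sx; w <= wmax; w += sx) {
                    for (int h = relaxed ? 0 : sy; h <= hmax; h += sy) {
                        count++;
                        if (w == 0 || h == 0)
                            zero++;
                    }
                }
            }
        }
    }

    if (degenerate != NULL)
        *degenerate = zero;
    return count;
}
```

Under these assumptions, `count_features(24, 0, NULL)` returns 162336, and `count_features(24, 1, &z)` returns 180625 while setting `z` to 43969, the number of zero-area entries included in that total.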

Now of course the question: have they made a mistake in their implementation? Does it make any sense to consider features with a surface of zero? Or am I seeing it the wrong way?

Recommended answer

Upon closer look, your code looks correct to me, which makes one wonder whether the original authors had an off-by-one bug. I guess someone ought to look at how OpenCV implements it!

Nonetheless, one suggestion to make it easier to understand is to flip the order of the for loops by going over all sizes first, then looping over the possible locations given the size:

#include <stdio.h>
int main()
{
    int i, x, y, sizeX, sizeY, width, height, count, c;

    /* All five shape types */
    const int features = 5;
    const int feature[][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    const int frameSize = 24;

    count = 0;
    /* Each shape */
    for (i = 0; i < features; i++) {
        sizeX = feature[i][0];
        sizeY = feature[i][1];
        printf("%dx%d shapes:\n", sizeX, sizeY);

        /* each size (multiples of basic shapes) */
        for (width = sizeX; width <= frameSize; width+=sizeX) {
            for (height = sizeY; height <= frameSize; height+=sizeY) {
                printf("\tsize: %dx%d => ", width, height);
                c=count;

                /* each possible position given size */
                for (x = 0; x <= frameSize-width; x++) {
                    for (y = 0; y <= frameSize-height; y++) {
                        count++;
                    }
                }
                printf("count: %d\n", count-c);
            }
        }
    }
    printf("%d\n", count);

    return 0;
}

This gives the same result as before: 162,336.

To verify it, I tested the case of a 4x4 window and manually checked all cases (easy to count since 1x2/2x1 and 1x3/3x1 shapes are the same only 90 degrees rotated):

2x1 shapes:
        size: 2x1 => count: 12
        size: 2x2 => count: 9
        size: 2x3 => count: 6
        size: 2x4 => count: 3
        size: 4x1 => count: 4
        size: 4x2 => count: 3
        size: 4x3 => count: 2
        size: 4x4 => count: 1
1x2 shapes:
        size: 1x2 => count: 12             +-----------------------+
        size: 1x4 => count: 4              |     |     |     |     |
        size: 2x2 => count: 9              |     |     |     |     |
        size: 2x4 => count: 3              +-----+-----+-----+-----+
        size: 3x2 => count: 6              |     |     |     |     |
        size: 3x4 => count: 2              |     |     |     |     |
        size: 4x2 => count: 3              +-----+-----+-----+-----+
        size: 4x4 => count: 1              |     |     |     |     |
3x1 shapes:                                |     |     |     |     |
        size: 3x1 => count: 8              +-----+-----+-----+-----+
        size: 3x2 => count: 6              |     |     |     |     |
        size: 3x3 => count: 4              |     |     |     |     |
        size: 3x4 => count: 2              +-----------------------+
1x3 shapes:
        size: 1x3 => count: 8                  Total Count = 136
        size: 2x3 => count: 6
        size: 3x3 => count: 4
        size: 4x3 => count: 2
2x2 shapes:
        size: 2x2 => count: 9
        size: 2x4 => count: 3
        size: 4x2 => count: 3
        size: 4x4 => count: 1
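
Because the horizontal and vertical loops are independent, the count for each shape factors into a per-axis count: for a base length b in a window of n pixels, the number of (size, position) pairs along one axis is the sum of (n - len + 1) over len = b, 2b, ... up to n. The sketch below uses that observation as a cross-check on the nested loops. The names `axis_placements` and `count_features_by_axis` are hypothetical and chosen only for this illustration.

```c
/*
 * Number of (size, position) pairs along one axis for a base length b
 * in a window of n pixels: each length len = b, 2b, ... <= n fits at
 * n - len + 1 offsets.
 */
static long axis_placements(int n, int b)
{
    long total = 0;
    for (int len = b; len <= n; len += b)
        total += n - len + 1;
    return total;
}

/*
 * Total feature count for an n x n window, assuming the same five base
 * shapes as in the listings above. Each shape contributes the product
 * of its horizontal and vertical per-axis counts.
 */
static long count_features_by_axis(int n)
{
    static const int shapes[5][2] = {{2,1}, {1,2}, {3,1}, {1,3}, {2,2}};
    long count = 0;

    for (int i = 0; i < 5; i++)
        count += axis_placements(n, shapes[i][0]) *
                 axis_placements(n, shapes[i][1]);
    return count;
}
```

For the 4x4 window the per-axis counts are 10, 4 and 2 for base lengths 1, 2 and 3, giving 2*4*10 + 2*2*10 + 4*4 = 136. For the 24x24 window they are 300, 144 and 92, giving 2*144*300 + 2*92*300 + 144*144 = 162336, in agreement with both listings.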
