OpenCV SIFT descriptor keypoint radius


Question

I was digging into OpenCV's implementation of SIFT descriptor extraction and came upon some puzzling code that computes the radius of the interest point neighborhood. Below is the annotated code, with variable names changed to be more descriptive:

// keep octave below 256 (255 is 1111 1111)
int octave = kpt.octave & 255;
// if octave is >= 128, ...????
octave = octave < 128 ? octave : (-128 | octave);
// 1/2^absval(octave)
float scale = octave >= 0 ? 1.0f/(1 << octave) : (float)(1 << -octave);
// multiply the point's radius by the calculated scale
float scl = kpt.size * 0.5f * scale;
// the constant sclFactor is 3 and has the following comment:
// determines the size of a single descriptor orientation histogram
float histWidth = sclFactor * scl;
// descWidth is the number of histograms on one side of the descriptor
// the long float is sqrt(2)
int radius = (int)(histWidth * 1.4142135623730951f * (descWidth + 1) * 0.5f);

I understand that this has something to do with converting to the scale from which the interest point was taken (I have read Lowe's paper), but I can't connect the dots to the code. Specifically, I don't understand the first three lines and the last line.

I need to understand this to create a similar local point descriptor for motion.

Recommended answer

I don't understand the first 3 lines

Indeed, this SIFT implementation encodes several values within the KeyPoint octave attribute. If you refer to line 439 of the source, you can see that:

kpt.octave = octv + (layer << 8) + (cvRound((xi + 0.5)*255) << 16);

This means the octave is stored in the lowest byte, the layer in the second byte, and so on.

So kpt.octave & 255 (which can be found in the unpackOctave method) just masks the packed keypoint octave to retrieve the effective octave value.
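The byte layout described above can be illustrated with a small standalone sketch. It does not use OpenCV; the struct and function names (PackedFields, splitPackedOctave) are hypothetical and chosen only for this example, and the field positions are taken from the packing line quoted earlier.

```cpp
// Hypothetical holder for the three byte fields packed into KeyPoint::octave.
struct PackedFields {
    int octaveByte;  // lowest byte: stored octave value, 0..255, not yet sign-corrected
    int layer;       // second byte: layer within the octave
    int xiByte;      // third byte: cvRound((xi + 0.5) * 255) in the packing line
};

// Split a packed octave integer into its fields with shifts and masks.
inline PackedFields splitPackedOctave(int packed) {
    PackedFields f;
    f.octaveByte = packed & 255;
    f.layer = (packed >> 8) & 255;
    f.xiByte = (packed >> 16) & 255;
    return f;
}
```

For instance, a value built as 2 + (3 << 8) + (128 << 16) splits back into octave byte 2, layer 3 and xi byte 128.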

Also, this SIFT implementation uses a negative first octave (int firstOctave = -1) to work with a higher-resolution image. Since the octave indices start at 0, a mapping is computed:

octave index = 0 => 255
octave index = 1 => 0
octave index = 2 => 1
...

This mapping is computed at line 790:

kpt.octave = (kpt.octave & ~255) | ((kpt.octave + firstOctave) & 255);
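As a rough illustration of this mapping, the same expression can be wrapped in a free function and applied to a few packed values. The name applyFirstOctave is an assumption made for this sketch and is not an OpenCV function.

```cpp
// Shift only the lowest byte (the octave index) by firstOctave and
// leave the upper bytes (layer, xi) untouched. Mirrors the quoted expression.
inline int applyFirstOctave(int packed, int firstOctave) {
    return (packed & ~255) | ((packed + firstOctave) & 255);
}
```

With firstOctave = -1, octave index 0 ends up stored as 255 in the low byte, index 1 as 0, and index 2 as 1. The upper bytes survive because they are copied from the unmodified value before the OR, so the borrow caused by subtracting 1 from a zero low byte is discarded by the & 255 mask.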

Thus the second line above is just a way to map these values back:

octave = 255 => -1
octave = 0   => 0
octave = 1   => 1
..
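The reverse mapping is effectively a sign extension of the low byte: values of 128 or more are reinterpreted as negative numbers. A minimal sketch, with the hypothetical name signedOctave:

```cpp
// Interpret a value in 0..255 as a signed 8-bit quantity.
// -128 has every bit from bit 7 upward set, so OR-ing it in fills the
// high bits with ones and yields the negative value (255 -> -1, 254 -> -2).
inline int signedOctave(int octaveByte) {
    return octaveByte < 128 ? octaveByte : (-128 | octaveByte);
}
```

For inputs in 128..255 the result equals octaveByte - 256, which may be an easier way to read the expression.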

The third line is just a way to compute the scale, taking into account that negative octaves give a scale > 1. For example, 1 << -octave gives 2 for octave = -1, which means the size is doubled.
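To make the two branches concrete, here is a small sketch of the scale computation as a standalone function. The name octaveScale is hypothetical; the expression itself is the one from the question.

```cpp
// Scale factor for a signed octave index:
// non-negative octaves were downsampled, so the factor is 1 / 2^octave;
// negative octaves were upsampled, so the factor is 2^(-octave).
inline float octaveScale(int octave) {
    return octave >= 0 ? 1.0f / (1 << octave)
                       : static_cast<float>(1 << -octave);
}
```

So octave -1 gives 2, octave 0 gives 1, and octave 2 gives 0.25. The branch is needed because shifting by a negative amount is undefined in C++.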

[I don't understand the] last line.

Basically, it corresponds to the radius of a circle that encloses a square patch of side D, hence the sqrt(2) and the division by 2. D is computed by multiplying:

  • the keypoint scale,
  • a magnification factor = 3,
  • the width of the descriptor histogram = 4, rounded up to the next integer (hence the +1)
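The product listed above can be sketched as a single function that mirrors the last lines of the code in the question. The function name descriptorRadius and its parameter names are assumptions for this example; the defaults 3 and 4 are the magnification factor and descriptor width mentioned in the list.

```cpp
// Radius of the circle enclosing the (possibly rotated) square descriptor patch.
// keypointSize is the keypoint diameter (kpt.size); scale is the octave scale factor.
inline int descriptorRadius(float keypointSize, float scale,
                            float magnification = 3.0f, int descWidth = 4) {
    float scl = keypointSize * 0.5f * scale;   // keypoint radius at the octave's resolution
    float histWidth = magnification * scl;     // side of one histogram cell, in pixels
    // Patch side D = histWidth * (descWidth + 1); enclosing circle radius = sqrt(2) * D / 2.
    return static_cast<int>(histWidth * 1.4142135623730951f * (descWidth + 1) * 0.5f);
}
```

As a worked case, a keypoint of size 8 at scale 0.5 gives scl = 2, histWidth = 6, and a radius of 6 × 1.4142 × 5 × 0.5 ≈ 21.2, truncated to 21.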

Indeed, you can find a detailed description in vlfeat's SIFT implementation:

The support of each spatial bin has an extension of SBP = 3sigma pixels, where sigma is the scale of the keypoint. Thus all the bins together have a support SBP x NBP pixels wide. Since weighting and interpolation of pixel is used, the support extends by another half bin. Therefore, the support is a square window of SBP x (NBP + 1) pixels. Finally, since the patch can be arbitrarily rotated, we need to consider a window 2W += sqrt(2) x SBP x (NBP + 1) pixels wide.
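Written out as a formula, and assuming that SBP corresponds to histWidth, NBP to descWidth = 4, and sigma to scl in the OpenCV snippet from the question (a reading of the two sources side by side, not something either source states), the half-width W of that window is the radius computed on the last line:

```latex
\begin{aligned}
D &= \mathrm{SBP} \times (\mathrm{NBP} + 1) = 3\sigma\,(4 + 1) = 15\,\sigma, \\
W &= \frac{\sqrt{2}}{2}\, D = \frac{\sqrt{2}}{2} \cdot 3\sigma \cdot 5 \approx 10.6\,\sigma .
\end{aligned}
```

The factor sqrt(2) comes from the diagonal of the square patch: a square of side D rotated by any angle always fits inside a circle of radius sqrt(2) × D / 2.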

Finally, I strongly recommend referring to the vlfeat SIFT documentation.
