How does the SiftDescriptorExtractor from OpenCV convert descriptor values?


Problem description


I have a question about the last part of the SiftDescriptorExtractor job,

I'm doing the following:

    SiftDescriptorExtractor extractor;
    Mat descriptors_object;
    extractor.compute( img_object, keypoints_object, descriptors_object );

Now I want to check the elements of a descriptors_object Mat object:

    std::cout << descriptors_object.row(1) << std::endl;

output looks like:

    [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 3, 3, 0, 0, 0, 0, 0, 0, 32, 15, 0, 0, 0, 0, 0, 0, 73, 33, 11, 0, 0, 0, 0, 0, 0, 5, 114, 1, 0, 0, 0, 0, 51, 154, 20, 0, 0, 0, 0, 0, 154, 154, 1, 2, 1, 0, 0, 0, 154, 148, 18, 1, 0, 0, 0, 0, 0, 2, 154, 61, 0, 0, 0, 0, 5, 60, 154, 30, 0, 0, 0, 0, 34, 70, 6, 15, 3, 2, 1, 0, 14, 16, 2, 0, 0, 0, 0, 0, 0, 0, 154, 84, 0, 0, 0, 0, 0, 0, 154, 64, 0, 0, 0, 0, 0, 0, 6, 6, 1, 0, 1, 0, 0, 0]

But in Lowe's paper it is stated that:

Therefore, we reduce the influence of large gradient magnitudes by thresholding the values in the unit feature vector to each be no larger than 0.2, and then renormalizing to unit length. This means that matching the magnitudes for large gradients is no longer as important, and that the distribution of orientations has greater emphasis. The value of 0.2 was determined experimentally using images containing differing illuminations for the same 3D objects.

So the numbers from the feature vector should be no larger than 0.2.

The question is, how have these values been converted in the Mat object?

Solution

So the numbers from the feature vector should be no larger than 0.2.

No. The paper says that SIFT descriptors are:

  1. normalized (with L2 norm)
  2. truncated using 0.2 as a threshold (i.e. loop over the normalized values and truncate when appropriate)
  3. normalized again

So in theory any SIFT descriptor component lies in [0, 1], even though in practice the effective range observed is smaller (see below).
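
For illustration only, here is a simplified sketch of those three steps applied to a plain float vector (this is not OpenCV's actual code, which is quoted further below, and the function name is just illustrative):

    #include <algorithm>
    #include <cmath>
    #include <vector>

    // Simplified sketch of Lowe's normalize -> truncate at 0.2 -> renormalize steps
    // (the final quantization to unsigned char is discussed below).
    void normalizeTruncateRenormalize(std::vector<float>& desc)
    {
        auto l2Normalize = [](std::vector<float>& v) {
            float norm = 0.f;
            for (float x : v) norm += x * x;
            norm = std::sqrt(std::max(norm, 1e-7f));
            for (float& x : v) x /= norm;
        };
        l2Normalize(desc);                            // 1. normalize (L2 norm)
        for (float& x : desc) x = std::min(x, 0.2f);  // 2. truncate at the 0.2 threshold
        l2Normalize(desc);                            // 3. normalize again
    }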

The question is, how have these values been converted in the Mat object?

They are converted from floating-point values to unsigned chars.

Here's the relevant section from the calcSIFTDescriptor method in OpenCV's modules/nonfree/src/sift.cpp:

    // L2 norm of the raw descriptor
    float nrm2 = 0;
    len = d*d*n;
    for( k = 0; k < len; k++ )
        nrm2 += dst[k]*dst[k];
    // threshold each component at 0.2 * (L2 norm), per Lowe's paper
    float thr = std::sqrt(nrm2)*SIFT_DESCR_MAG_THR;
    for( i = 0, nrm2 = 0; i < k; i++ )
    {
        float val = std::min(dst[i], thr);
        dst[i] = val;
        nrm2 += val*val;
    }
    // renormalize, scale by SIFT_INT_DESCR_FCTR and clamp into the uchar range
    nrm2 = SIFT_INT_DESCR_FCTR/std::max(std::sqrt(nrm2), FLT_EPSILON);
    for( k = 0; k < len; k++ )
    {
        dst[k] = saturate_cast<uchar>(dst[k]*nrm2);
    }

With:

    static const float SIFT_INT_DESCR_FCTR = 512.f;
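
For completeness, the magnitude threshold used in the truncation loop above is defined in the same file; the value below is what appears in OpenCV 2.4's nonfree sift.cpp (worth double-checking against your OpenCV version):

    static const float SIFT_DESCR_MAG_THR = 0.2f;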

This is because classical SIFT implementations quantize the normalized floating-point values into unsigned char integers using a multiplying factor of 512, which is equivalent to assuming that any SIFT component varies within [0, 1/2], and avoids losing precision by trying to encode the full [0, 1] range.
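
As a usage sketch (not part of the original answer): if you prefer to work with approximately L2-normalized float descriptors again, you can undo the quantization by rescaling with the same 512 factor; descriptors_float is just an illustrative name:

    // Convert the quantized descriptors (integer values in [0, 255]) back to
    // floats and divide by 512 to approximate the normalized components.
    cv::Mat descriptors_float;
    descriptors_object.convertTo(descriptors_float, CV_32F, 1.0 / 512.0);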
