Need advice on SIFT features: is there such a thing as a good feature?


Question

I am trying out VLFeat. I extracted a large number of features from an image database and am evaluating the results against the ground truth using mean average precision (mAP). Overall, I get roughly 40%. Some papers report a higher mAP while using techniques very similar to mine: the standard bag of words.
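For reference, here is a minimal sketch of how mAP is commonly computed for a retrieval benchmark. It assumes each query's result list is already ranked and contains every relevant item; the function names are illustrative and not part of VLFeat.

```python
def average_precision(ranked_relevance):
    """AP for one query; ranked_relevance[i] is True if result i is relevant.

    Assumes every relevant item appears somewhere in the ranked list,
    so the number of hits equals the number of relevant items.
    """
    hits = 0
    precision_sum = 0.0
    for rank, is_relevant in enumerate(ranked_relevance, start=1):
        if is_relevant:
            hits += 1
            precision_sum += hits / rank  # precision at this recall point
    return precision_sum / hits if hits else 0.0


def mean_average_precision(queries):
    """Mean of the per-query average precisions."""
    return sum(average_precision(q) for q in queries) / len(queries)


if __name__ == "__main__":
    # Two hypothetical queries with made-up relevance judgements.
    print(mean_average_precision([[True, False, True], [False, True]]))
```

Benchmarks differ in details such as how unretrieved relevant items are counted, so numbers are only comparable across papers when the same protocol is used.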

I am looking for ways to obtain a higher mAP with the standard bag-of-words technique. There are other implementations such as SURF and so on, but let's stick to standard Lowe's SIFT and the standard bag of words in this question.

The thing is this: vl_sift has thresholds that let you be stricter about feature selection. As I understand it, a higher threshold might give you a smaller, more meaningful list of "good" features and possibly remove some noisy ones. By "good" features I mean that, given variations of the same image, very similar features are also detected in the other images.
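To make the thresholding behaviour concrete, here is a toy model of a contrast threshold. It is not VLFeat's implementation; the keypoint records and the `contrast` field are hypothetical. It only shows that raising the threshold can never add keypoints and can eventually leave none.

```python
def filter_by_peak_threshold(keypoints, peak_thresh):
    """Keep keypoints whose absolute contrast is at least peak_thresh."""
    return [kp for kp in keypoints if abs(kp["contrast"]) >= peak_thresh]


# Hypothetical keypoints with made-up contrast values.
keypoints = [
    {"x": 10, "y": 12, "contrast": 0.08},
    {"x": 40, "y": 7, "contrast": -0.03},
    {"x": 25, "y": 30, "contrast": 0.01},
]

if __name__ == "__main__":
    for thresh in (0.0, 0.02, 0.05, 0.1):
        kept = filter_by_peak_threshold(keypoints, thresh)
        print(thresh, len(kept))
```

The survivors at a higher threshold are always a subset of those at a lower one, which guarantees fewer features but says nothing by itself about whether they are better for retrieval.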

However, how high should this threshold go? Sometimes an image returns no features at all at a higher threshold. At first, I was thinking of adjusting the threshold until I got a better mAP. But I think it's a bad idea to keep adjusting just to find the best mAP for one particular database. So my questions are:

  1. Adjusting the threshold may decrease the number of features, but does increasing the threshold always return fewer yet better features?

  2. Are there better approaches to obtaining good features?

  3. What other factors can increase the rate of obtaining good features?

Answer

Have a look at some of the papers put out in response to the Pascal challenge in recent years. The impression they give me is that standard 'feature detection' methods don't work very well with the Bag of Words technique. This makes sense when you think about it: BoW works by pulling together lots of weak, often unrelated features. It's less about detecting a specific object and more about recognizing classes of objects and scenes. As such, putting too much emphasis on normal 'key features' can harm more than help.

As such, we see folks using dense grids and even random points as their features. From experience, using one of these methods instead of Harris corners, LoG, SIFT, MSER, or the like has a great positive impact on performance.
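As a rough sketch of what "dense grid" and "random points" mean in practice, the snippet below only generates the sample locations at which descriptors would then be computed. The function names and the `step`/`margin` parameters are illustrative assumptions, not VLFeat's API.

```python
import random


def dense_grid(width, height, step, margin=0):
    """Return (x, y) sample centres on a regular grid inside the image."""
    return [
        (x, y)
        for y in range(margin, height - margin, step)
        for x in range(margin, width - margin, step)
    ]


def random_points(width, height, count, seed=0):
    """Return `count` uniformly random (x, y) locations inside the image."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    return [(rng.randrange(width), rng.randrange(height)) for _ in range(count)]


if __name__ == "__main__":
    grid = dense_grid(width=64, height=48, step=8, margin=8)
    print(len(grid), grid[0], grid[-1])
    print(random_points(width=64, height=48, count=5))
```

Neither sampler looks at image content, which is the point: every region contributes descriptors, whether or not a detector would have fired there.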

To answer your questions directly:

  1. Yes. From the SIFT API:

    Keypoints are further refined by eliminating those that are likely to be unstable, either because they are selected nearby an image edge, rather than an image blob, or are found on image structures with low contrast. Filtering is controlled by the following:
    Peak threshold. This is the minimum amount of contrast to accept a keypoint. It is set by configuring the SIFT filter object by vl_sift_set_peak_thresh().
    Edge threshold. This is the edge rejection threshold. It is set by configuring the SIFT filter object by vl_sift_set_edge_thresh().

    You can see examples of the two thresholds in action in the 'Detector parameters' section here.

  2. Research suggests that features densely selected from the scene yield more descriptive 'words' than those selected using more 'intelligent' methods (e.g., SIFT, Harris, MSER). Try your Bag of Words pipeline with VLFeat's DSIFT or PHOW implementation. You should see a great improvement in performance (assuming your 'word' selection and classification steps are tuned well).

  3. After a dense set of feature points, the biggest breakthrough in this field seems to have been the 'Spatial Pyramid' approach. This increases the number of words produced for an image, but adds a location aspect to the features, something inherently lacking in Bag of Words. After that, make sure your parameters are well tuned (which feature descriptor you're using (SIFT, HOG, SURF, etc.), how many words are in your vocabulary, which classifier you are using, etc.). Then... you're in active research land. Enjoy =)
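To illustrate the spatial pyramid idea from point 3, here is a simplified sketch. It assumes the features have already been quantized to visual-word indices, and it concatenates unweighted per-cell histograms; published spatial pyramid methods also weight the levels, which is omitted here. All names are illustrative.

```python
def spatial_pyramid_histogram(features, width, height, vocab_size, levels=2):
    """Concatenate per-cell word histograms for grids of 1x1, 2x2, ... cells.

    features: list of (x, y, word) with 0 <= x < width, 0 <= y < height
    and 0 <= word < vocab_size.
    """
    pyramid = []
    for level in range(levels):
        cells = 2 ** level  # number of cells per side at this level
        hist = [0] * (cells * cells * vocab_size)
        for x, y, word in features:
            # Locate the cell containing this feature.
            cx = min(int(x * cells / width), cells - 1)
            cy = min(int(y * cells / height), cells - 1)
            hist[(cy * cells + cx) * vocab_size + word] += 1
        pyramid.extend(hist)
    return pyramid


if __name__ == "__main__":
    # Three hypothetical quantized features in a 10x10 image, 2-word vocabulary.
    features = [(1, 1, 0), (9, 1, 1), (9, 9, 1)]
    print(spatial_pyramid_histogram(features, 10, 10, vocab_size=2, levels=2))
```

Level 0 is the plain Bag of Words histogram; each further level repeats the count per grid cell, so the descriptor grows but now records roughly where each word occurred.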
