Bag of Features: how to create the query histogram?


Question

I'm trying to implement the Bag of Features model.

Given a descriptor matrix (representing an image) that belongs to the initial dataset, computing its histogram is easy, since k-means has already told us which cluster each descriptor vector belongs to.

But what if we want to compute the histogram of a query matrix? The only solution that comes to mind is to compute the distance from each descriptor vector to each of the k cluster centroids.

This can be inefficient: suppose that k=100 (so 100 centroids) and that the query image is represented by 1000 SIFT descriptors, which gives a 1000x100 matrix of distances.

We then have to compute 1000 * 100 Euclidean distances in 128 dimensions. This seems really inefficient.
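For a sense of scale, the brute-force assignment described above can be sketched as a single matrix product. This is only an illustration under assumptions: the function name `bof_histogram` is made up for this example, and the arrays are random stand-ins with the sizes from the question, not real SIFT output.

```python
import numpy as np

def bof_histogram(descriptors, centroids):
    """Return an L1-normalized bag-of-features histogram for one image.

    descriptors: (n, d) array, e.g. n=1000 SIFT descriptors with d=128
    centroids:   (k, d) array, the k-means visual vocabulary
    """
    # Squared Euclidean distances via ||x||^2 - 2 x.c + ||c||^2,
    # computed with one matrix product instead of an explicit double loop.
    d2 = (
        (descriptors ** 2).sum(axis=1)[:, None]
        - 2.0 * descriptors @ centroids.T
        + (centroids ** 2).sum(axis=1)[None, :]
    )
    nearest = d2.argmin(axis=1)  # index of the closest centroid per descriptor
    hist = np.bincount(nearest, minlength=len(centroids)).astype(float)
    return hist / hist.sum()

# Random stand-in data with the sizes from the question (assumption).
rng = np.random.default_rng(0)
centroids = rng.random((100, 128))
query = rng.random((1000, 128))
hist = bof_histogram(query, centroids)
```

The squared distance is enough here because only the position of the minimum matters, so the square root can be skipped.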

How can I solve this problem?

NOTE: can you suggest some implementations where this point is explained?

NOTE: I know LSH is a solution (since we are using high-dimensional vectors), but I don't think that actual implementations use it.

UPDATE:

I was talking with a colleague of mine: using a hierarchical clustering approach instead of classic k-means should speed up the process considerably. Is it correct to say that with k centroids and a hierarchical clustering we only need log(k) comparisons to find the closest centroid, instead of k comparisons?
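One way to picture the hierarchical idea is a tree built by recursive k-means with branching factor b: a lookup compares the descriptor with b centroids per level, so reaching one of k = b^L leaves costs about b * log_b(k) comparisons rather than exactly log(k). The descent is greedy, so the leaf it reaches is not guaranteed to be the globally nearest one. The sketch below is a minimal illustration under assumptions: the helpers `kmeans`, `build_tree`, and `lookup` are hypothetical names written for this example, and the data is random.

```python
import numpy as np

def kmeans(data, k, rng, iters=10):
    """Tiny Lloyd's k-means (illustrative only); returns (centroids, labels)."""
    centroids = data[rng.choice(len(data), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(data[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = data[labels == j]
            if len(members):  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids, labels

def build_tree(data, branch, depth, rng, counter):
    """Recursively cluster `data`; leaves receive consecutive visual-word ids."""
    if depth == 0 or len(data) < branch:
        leaf_id = counter[0]
        counter[0] += 1
        return {"leaf": leaf_id}
    centroids, labels = kmeans(data, branch, rng)
    children = [
        build_tree(data[labels == j], branch, depth - 1, rng, counter)
        for j in range(branch)
    ]
    return {"centroids": centroids, "children": children}

def lookup(tree, x):
    """Greedy descent; returns (leaf id, number of distance computations)."""
    comparisons = 0
    node = tree
    while "leaf" not in node:
        dists = np.linalg.norm(node["centroids"] - x, axis=1)
        comparisons += len(dists)
        node = node["children"][dists.argmin()]
    return node["leaf"], comparisons

# Random stand-in for training descriptors (assumption, not real SIFT data).
rng = np.random.default_rng(0)
train = rng.random((2000, 128))
counter = [0]
tree = build_tree(train, branch=10, depth=2, rng=rng, counter=counter)
n_words = counter[0]  # at most 10**2 = 100 visual words
word, cost = lookup(tree, rng.random(128))
```

With `branch=10` and `depth=2`, a lookup performs at most 20 distance computations, compared with 100 for a flat vocabulary of the same size.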

Recommended answer

For a bag of features approach, you do indeed need to quantize the descriptors. Yes, if you have 10000 features and 100 clusters, that is 10000*100 distances (unless you use an index here). Compare this with matching each of the 10000 features against each of the 10000 features of every image in your database. Does it still sound that bad?
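As a rough illustration of "using an index", the centroids can be placed in a spatial index that is built once and queried for every new image. The sketch below uses SciPy's `cKDTree` on random stand-in data; the choice of a KD-tree is an assumption made for the example, not something the answer prescribes. In 128 dimensions an exact KD-tree often gains little over brute force, which is why approximate indexes such as the LSH mentioned in the question are usually considered instead.

```python
import numpy as np
from scipy.spatial import cKDTree

# Random stand-ins with the sizes from the question (assumption).
rng = np.random.default_rng(0)
centroids = rng.random((100, 128))   # visual vocabulary, k=100
query = rng.random((1000, 128))      # descriptors of one query image

index = cKDTree(centroids)                # build once, reuse for every query
dists, nearest = index.query(query, k=1)  # exact nearest centroid per descriptor

# Count how many descriptors fall on each visual word, then L1-normalize.
hist = np.bincount(nearest, minlength=len(centroids)).astype(float)
hist /= hist.sum()
```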
