Is LSH about transforming vectors to binary vectors for hamming distance?


Problem description

I read some papers about LSH and I know that it is used for solving the approximate k-NN problem. We can divide the algorithm into two parts:


  1. Given a vector in D dimensions (where D is big) with values of any kind, translate it with a set of N hash functions (where N << D) into a binary vector in N dimensions.

  2. Using the Hamming distance, apply some search technique on the set of binary codes obtained from phase 1 to find the k-NN.

The key point is that computing the Hamming distance for vectors in N dimensions is fast using XOR.
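
As a minimal sketch of the two phases, assuming the common random-hyperplane LSH family (D, N, and every name below are illustrative, not taken from the question):

```python
import numpy as np

rng = np.random.default_rng(0)

D = 1024   # original dimensionality (big)
N = 64     # number of hash functions, N << D

# One random hyperplane per hash function (random-hyperplane LSH family).
hash_planes = rng.standard_normal((N, D))

def to_binary_code(x):
    """Phase 1: map a D-dimensional vector to an N-bit binary code."""
    return (hash_planes @ x > 0).astype(np.uint8)

def hamming_distance(a, b):
    """Phase 2 primitive: count differing bits between two codes (XOR + popcount)."""
    return int(np.count_nonzero(np.bitwise_xor(a, b)))

x = rng.standard_normal(D)
y = rng.standard_normal(D)
print(hamming_distance(to_binary_code(x), to_binary_code(y)))
```

In practice the N-bit codes are packed into machine words, so each comparison is an XOR plus a popcount, which is what makes phase 2 fast.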

Anyway, I have two questions:


  1. Is point 1 still necessary if we use a binary descriptor, like ORB? Since ORB's descriptors are already binary and we use the Hamming distance to compare them, why should we perform the first step at all?

  2. How does the conversion happen for images described by SIFT? Each SIFT descriptor is a 128-dimensional vector and each image is described by a set of descriptors, so we have a descX128 matrix (where desc is the number of descriptors used), while LSH usually accepts a single vector as input.


Answer

1) You can bypass it, but then you will work in D dimensions, not in N (where N << D) as you say. That means that the algorithm has to adapt to D dimensions as well.
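
For the ORB case, here is a hedged sketch of what "bypassing" it looks like: the 256-bit ORB descriptors are compared directly with the Hamming distance, so you work in D = 256 binary dimensions and no extra hashing step is applied (this uses OpenCV's brute-force matcher rather than LSH; the image paths are placeholders):

```python
import cv2

# Placeholder paths for a query image and a database image.
img1 = cv2.imread("query.png", cv2.IMREAD_GRAYSCALE)
img2 = cv2.imread("train.png", cv2.IMREAD_GRAYSCALE)

orb = cv2.ORB_create()                        # 256-bit (32-byte) binary descriptors
kp1, des1 = orb.detectAndCompute(img1, None)
kp2, des2 = orb.detectAndCompute(img2, None)

# Brute-force matching directly in D = 256 binary dimensions, no phase-1 hashing.
bf = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)
matches = sorted(bf.match(des1, des2), key=lambda m: m.distance)
print(matches[0].distance if matches else "no matches")
```

Once the dataset grows large, an LSH index over those binary descriptors (for example OpenCV's FLANN-based LSH index) buckets the codes by shorter hash keys, which is where re-hashing still pays off.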

2)

Reading about SIFT from openCV:

  1. Keypoint Descriptor

Now the keypoint descriptor is created. A 16x16 neighbourhood around the keypoint is taken. It is divided into 16 sub-blocks of 4x4 size. For each sub-block, an 8-bin orientation histogram is created. So a total of 128 bin values are available. It is represented as a vector to form the keypoint descriptor. In addition to this, several measures are taken to achieve robustness against illumination changes, rotation etc.
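
To make the quoted arithmetic concrete (16 sub-blocks x 8 bins = 128 values), here is a small sketch, assuming an OpenCV build that ships SIFT (cv2.SIFT_create in recent opencv-python); the image path is a placeholder:

```python
import cv2

img = cv2.imread("image.png", cv2.IMREAD_GRAYSCALE)   # placeholder path

sift = cv2.SIFT_create()
keypoints, descriptors = sift.detectAndCompute(img, None)

# One 128-dimensional row (16 sub-blocks x 8 bins) per detected keypoint,
# so descriptors has shape (desc, 128).
print(descriptors.shape)
```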

Here is how I have it in my mind, hope that will be enough:

LSH takes as input a pointset of n points, where every point lies in d dimensions.

So, a query is a point in d dimensions, and the goal is to find its NN*.


  1. Now every point represents an image descriptor. So, we have n images in our dataset.

The query, which is also a point (i.e. a vector with d coordinates), represents another image descriptor.

We are seeking to match (i.e. to find the Nearest Neighbor) the query image descriptor with an image descriptor from our dataset.

So the conversion you are talking about is applied to a vector, not a matrix.
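
To make that concrete, here is a minimal brute-force sketch (the hash tables and bucketing of a real LSH index are omitted, and planes, to_code and the random data are illustrative assumptions): every row of the descX128 matrix is hashed and queried as an independent 128-dimensional point.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 64                                    # code length, N << 128

# Stand-in for the stacked (desc, 128) SIFT descriptors of the dataset images.
descriptors = rng.standard_normal((5000, 128)).astype(np.float32)

planes = rng.standard_normal((N, 128))    # one random hyperplane per hash function

def to_code(x):
    """Hash one 128-dimensional descriptor to an N-bit binary code."""
    return (planes @ x > 0).astype(np.uint8)

# Each row is hashed on its own: the unit LSH sees is a single descriptor
# vector, never the whole descX128 matrix.
codes = np.stack([to_code(row) for row in descriptors])

def knn_query(q, k=5):
    """Rank all dataset descriptors by Hamming distance to the query descriptor q."""
    dists = np.count_nonzero(np.bitwise_xor(codes, to_code(q)), axis=1)
    return np.argsort(dists)[:k]

print(knn_query(descriptors[0]))
```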

EDIT

Moreover, from our High-dimensional approximate nearest neighbor: k-d Generalized Randomized Forests paper, see this in the Experiments section:


SIFT is a 128-dimensional vector that describes a local image patch by histograms of local gradient orientations.

* the fixed-radius near neighbors problem
