k* reproduction values?
I am reading about Product Quantization in Section II.A (page 3) of PQ for NNS, which says:
...all subquantizers have the same finite number k* of reproduction values. In that case the number of centroids is (k*)^m
where m is the number of subvectors.
However, I do not get k* at all! In vector quantization we assign every vector to one of k centroids. In product quantization, we assign every subvector to one of k centroids. How did k* come into play?
I think k* is the number of centroids in each subspace, and k is the number of centroids in the whole space.
For example, if the data is 2-D, like (x, y), and we treat each dimension as a subspace and run k-means with, say, k* = 3 on each one, we get 3 centroids in each subspace: {x1, x2, x3} and {y1, y2, y3}.
Then there are 3^2 = 9 possible centroids in the whole space, which are* (x1, y1), (x1, y2), (x1, y3), (x2, y1), ...
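The enumeration above can be sketched in a few lines of Python. The centroid values below are made up for illustration; only the counts (k* = 3 per subspace, 9 in total) come from the example.

```python
from itertools import product

# Hypothetical 1-D sub-codebooks with k* = 3 centroids each (values are invented).
x_centroids = [0.0, 1.0, 2.0]     # stands in for {x1, x2, x3}
y_centroids = [10.0, 20.0, 30.0]  # stands in for {y1, y2, y3}

# The centroids of the whole space are the Cartesian product of the sub-codebooks.
full_codebook = list(product(x_centroids, y_centroids))

print(len(full_codebook))  # 9
print(full_codebook[:4])   # [(0.0, 10.0), (0.0, 20.0), (0.0, 30.0), (1.0, 10.0)]
```

Note that the full codebook is built here only to show what it contains; in practice it is never materialized.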
In this way we can get a large number of centroids (2^64 in the paper) using a small amount of memory: we don't have to store all (k*)^m centroids, only the k* centroids in each subspace.
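To make the memory argument concrete, here is a rough sketch that compares the number of implied centroids with the number of floats actually stored. The function name pq_sizes is hypothetical, and the parameters m = 8, k* = 256 and D = 128 are assumptions chosen so that (k*)^m = 2^64; check the paper for the exact configuration it uses.

```python
def pq_sizes(m, k_star, D):
    """Return (implied centroids, floats stored by PQ, floats for an explicit codebook)."""
    assert D % m == 0, "D must split evenly into m subvectors"
    d_star = D // m                 # dimension of each subspace, D*
    implied = k_star ** m           # centroids in the whole space, (k*)^m
    stored = m * k_star * d_star    # floats in the m sub-codebooks, equals k* * D
    explicit = implied * D          # floats if every centroid were stored explicitly
    return implied, stored, explicit

# Assumed parameters (not taken from the text above): m = 8, k* = 256, D = 128.
implied, stored, explicit = pq_sizes(m=8, k_star=256, D=128)
print(implied == 2**64)  # True
print(stored)            # 32768
```

Under these assumed parameters, about 32 thousand floats stand in for a codebook that would otherwise need 2^64 entries.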
Edit:
In the example above, the number of subspaces is m = 2, the number of centroids in each subspace is k* = 3, the number of centroids in the whole space is k = 3^2 = 9, the number of dimensions of each subspace is D* = 1, and the number of floating-point values to store is m D* k* = D k* = 6.
* That is, the Cartesian product of {x1, x2, x3} and {y1, y2, y3}.
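As a minimal sketch of how a vector would be quantized with these numbers (m = 2, k* = 3, D* = 1), each subvector is assigned independently to its nearest sub-centroid, and the code for the whole vector is the tuple of m indices. The helper names encode and decode and the centroid values are hypothetical, and nearest-centroid assignment by squared Euclidean distance is assumed.

```python
# Hypothetical sub-codebooks: m = 2 subspaces, k* = 3 centroids each, D* = 1.
codebooks = [
    [[0.0], [1.0], [2.0]],     # x subspace
    [[10.0], [20.0], [30.0]],  # y subspace
]

def encode(vec, codebooks):
    """Quantize each subvector independently; return one centroid index per subspace."""
    d_star = len(codebooks[0][0])
    code = []
    for j, book in enumerate(codebooks):
        sub = vec[j * d_star:(j + 1) * d_star]
        # Squared Euclidean distance from the subvector to every sub-centroid.
        dists = [sum((a - b) ** 2 for a, b in zip(sub, c)) for c in book]
        code.append(dists.index(min(dists)))
    return tuple(code)

def decode(code, codebooks):
    """Rebuild the whole-space centroid by concatenating the chosen sub-centroids."""
    out = []
    for idx, book in zip(code, codebooks):
        out.extend(book[idx])
    return out

code = encode([1.2, 27.0], codebooks)
print(code)                     # (1, 2)
print(decode(code, codebooks))  # [1.0, 30.0]

# Only the sub-codebooks are stored: m * D* * k* = 2 * 1 * 3 floats.
floats_stored = sum(len(c) for book in codebooks for c in book)
print(floats_stored)            # 6
```

The pair (1, 2) selects one of the 9 whole-space centroids without that centroid ever being stored on its own.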