Similarity matrix -> feature vectors algorithm?
Question
If we have a set of M words, and know the similarity of the meaning of each pair of words in advance (i.e. have an M x M matrix of similarities), which algorithm can we use to produce one k-dimensional bit vector for each word, so that any pair of words can be compared just by comparing their vectors (e.g. taking the absolute difference of the vectors)?
I don't know what this particular problem is called. If I knew, it would be much easier to find it among a bunch of algorithms with similar descriptions that do something else.
Additional observations:
I think this algorithm would have to produce one side effect, which in this case is desirable. If, according to the matrix, word A is similar to word B and B is similar to C, but the matrix records low [A, C] similarity, the difference between the computed result vectors should nevertheless yield high [A, C] similarity. In other words, the algorithm would fill in the gaps in the matrix and smooth out the similarities somehow. But apart from this smoothing, the goal is to have results as close as possible to the original numbers we had in the matrix.
Answer
You could do a truncated singular value decomposition (SVD) to find the best rank-k approximation to the matrix. The idea is to decompose the matrix into three matrices: U, sigma, and V such that U and V are orthonormal and sigma is diagonal.
By truncating off unimportant singular values, you can achieve O(k*M) storage space.
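A minimal sketch of this with NumPy (the similarity values and the choice of k here are made-up assumptions, not from the question):

```python
import numpy as np

# Hypothetical 4x4 similarity matrix for 4 words: words 0-2 form a
# similar cluster, word 3 is dissimilar to the rest.
S = np.array([
    [1.0, 0.9, 0.8, 0.1],
    [0.9, 1.0, 0.7, 0.2],
    [0.8, 0.7, 1.0, 0.1],
    [0.1, 0.2, 0.1, 1.0],
])

k = 2  # target number of dimensions per word

# Full SVD: S = U @ diag(sigma) @ Vt
U, sigma, Vt = np.linalg.svd(S)

# Keep only the k largest singular values -> best rank-k approximation.
S_approx = U[:, :k] @ np.diag(sigma[:k]) @ Vt[:k, :]

# One k-dimensional vector per word (one row each). For a symmetric
# matrix whose leading eigenvalues are nonnegative, dot products of
# these vectors approximate the original similarities.
vectors = U[:, :k] * np.sqrt(sigma[:k])

print(np.round(S_approx, 2))
```

Note that the rank-k reconstruction `S_approx` also exhibits the smoothing discussed in the question: entries inconsistent with the dominant structure of the matrix get pulled toward the values implied by the rest of it.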