Similarity matrix -> feature vectors algorithm?

Question

If we have a set of M words and know the similarity in meaning of each pair of words in advance (that is, we have an M x M matrix of similarities), which algorithm can we use to make one k-dimensional bit vector for each word, so that any pair of words can be compared just by comparing their vectors (e.g. by taking the absolute difference of the vectors)?

I don't know what this particular problem is called. If I knew, it would be much easier to find the right algorithm among the many with similar descriptions that do something else.

Additional observation:

I think this algorithm would have to produce one side effect, which in this case is wanted. If, according to the matrix, word A is similar to word B and B is similar to C, but the recorded [A, C] similarity is low, then the difference between the computed vectors should indicate a high [A, C] similarity as well. In other words, the algorithm would fill in the gaps in the matrix and smooth the similarities. Apart from this smoothing, the goal is for the results to stay as close as possible to the original numbers in the matrix.

Recommended answer

You could do a truncated singular value decomposition (SVD) to find the best rank-k approximation to the matrix. The idea is to decompose the matrix into three matrices, U, sigma, and V, such that the matrix equals U * sigma * V^T, where U and V are orthonormal and sigma is diagonal.
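As a rough illustration of the decomposition described above, here is a minimal sketch using NumPy's `np.linalg.svd`. The 3 x 3 matrix and the helper name `truncated_svd` are made up for this example and are not from the question. The matrix encodes the situation from the question: A is similar to B, B is similar to C, but the recorded [A, C] similarity is low.

```python
import numpy as np

def truncated_svd(S, k):
    """Keep only the k largest singular values of S (hypothetical helper)."""
    # np.linalg.svd returns singular values sorted in descending order.
    U, sigma, Vt = np.linalg.svd(S)
    return U[:, :k], sigma[:k], Vt[:k, :]

# Illustrative similarities for words A, B, C (assumed values).
S = np.array([[1.0, 0.9, 0.1],
              [0.9, 1.0, 0.9],
              [0.1, 0.9, 1.0]])

U_k, sigma_k, Vt_k = truncated_svd(S, k=1)

# Rank-1 reconstruction of the similarity matrix.
S_approx = U_k @ np.diag(sigma_k) @ Vt_k

# The [A, C] entry comes out noticeably higher than the original 0.1.
print(S_approx[0, 2])
```

With this particular input, the low-rank reconstruction raises the [A, C] entry, which is the smoothing effect the question asks for. How strong the effect is depends on the data and on the choice of k.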

By truncating the unimportant (smallest) singular values, you reduce the storage from O(M^2) for the full matrix to O(k*M), that is, one k-dimensional vector per word.
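One way to turn the truncated factors into per-word vectors is sketched below, under a few assumptions: the similarity matrix is symmetric and positive semidefinite, each word's vector is its row of U scaled by the square roots of the singular values, and two words are compared with a dot product. The helper name `word_vectors` and the random test data are hypothetical. Note that the vectors are real-valued, not the bit vectors the question asks for; binarizing them would be a separate step.

```python
import numpy as np

def word_vectors(S, k):
    """Return an M x k array with one k-dimensional vector per word (sketch)."""
    U, sigma, _ = np.linalg.svd(S)
    # Scale each kept column of U by the square root of its singular value,
    # so that vectors @ vectors.T approximates S when S is positive semidefinite.
    return U[:, :k] * np.sqrt(sigma[:k])

# Made-up data: a positive semidefinite similarity matrix of rank k.
rng = np.random.default_rng(0)
M, k = 6, 2
X = rng.normal(size=(M, k))
S = X @ X.T

vectors = word_vectors(S, k)       # M * k numbers instead of M * M

# Estimate the similarity of words 1 and 4 from their vectors alone.
estimated = vectors[1] @ vectors[4]
print(estimated, S[1, 4])
```

Because the example matrix has rank exactly k, the dot products reproduce it up to rounding error. For a real similarity matrix of higher rank, the dot products would only approximate the original entries.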
