Find closest/similar value (vector) inside a matrix

Problem description

Let's say I have the following NumPy matrix (simplified):

import numpy as np

matrix = np.array([[1, 1],
                   [2, 2],
                   [5, 5],
                   [6, 6]])

Now I want to get the vector from the matrix that is closest to a "search" vector:

search_vec = np.array([3, 3])

What I did is the following:

min_dist = None
result_vec = None
for ref_vec in matrix:
    # Euclidean distance; a norm is never negative, so abs() is not needed
    distance = np.linalg.norm(search_vec - ref_vec)
    print(ref_vec, distance)
    if min_dist is None or min_dist > distance:
        min_dist = distance
        result_vec = ref_vec

The result is correct, but is there a native NumPy solution that does this more efficiently? My problem is that the bigger the matrix becomes, the slower the entire process gets. Are there other solutions that handle this kind of problem in a more elegant and efficient way?

Recommended answer

Approach 1

We can use the Cython-powered kd-tree for quick nearest-neighbor lookup, which is efficient in both memory use and performance -

In [276]: from scipy.spatial import cKDTree

In [277]: matrix[cKDTree(matrix).query(search_vec, k=1)[1]]
Out[277]: array([2, 2])
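
If many search vectors have to be matched against the same matrix, the tree can be built once and queried in a batch. The following is a minimal sketch of that idea; the array `search_vecs` is a hypothetical batch made up for illustration and is not part of the original question.

```python
import numpy as np
from scipy.spatial import cKDTree

matrix = np.array([[1, 1], [2, 2], [5, 5], [6, 6]])

# Hypothetical batch of search vectors (illustration only)
search_vecs = np.array([[3, 3], [5.4, 5.4], [0, 0]])

tree = cKDTree(matrix)                     # build the tree once
dists, idx = tree.query(search_vecs, k=1)  # nearest row index per search vector
closest = matrix[idx]                      # one closest row per search vector
```

Building the tree has an up-front cost, so this layout tends to pay off mainly when the same matrix is queried repeatedly.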

Approach 2

With SciPy's cdist -

In [286]: from scipy.spatial.distance import cdist

In [287]: matrix[cdist(matrix, np.atleast_2d(search_vec)).argmin()]
Out[287]: array([2, 2])
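
For a single search vector, the same brute-force distance computation can also be written in plain NumPy with broadcasting. This is a sketch of an equivalent alternative, not a method given in the answer itself:

```python
import numpy as np

matrix = np.array([[1, 1], [2, 2], [5, 5], [6, 6]])
search_vec = np.array([3, 3])

# Broadcasting subtracts search_vec from every row;
# the norm along axis=1 yields one Euclidean distance per row.
dists = np.linalg.norm(matrix - search_vec, axis=1)
closest_vec = matrix[dists.argmin()]
```

Like the loop in the question, this still computes every distance, but it does so in one vectorized call instead of a Python-level loop.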

Approach 3

With Scikit-learn's Nearest Neighbors -

from sklearn.neighbors import NearestNeighbors

nbrs = NearestNeighbors(n_neighbors=1).fit(matrix)
closest_vec = matrix[nbrs.kneighbors(np.atleast_2d(search_vec))[1][0,0]]

Approach 4

With Scikit-learn's kdtree -

from sklearn.neighbors import KDTree
kdt = KDTree(matrix, metric='euclidean')
cv = matrix[kdt.query(np.atleast_2d(search_vec), k=1, return_distance=False)[0,0]]

Approach 5

Based on the eucl_dist package (disclaimer: I am its author) and following the wiki contents, we can leverage matrix multiplication -

M = matrix.dot(search_vec)
# d holds squared Euclidean distances; argmin is unchanged by skipping the square root
d = np.einsum('ij,ij->i', matrix, matrix) + np.inner(search_vec, search_vec) - 2*M
closest_vec = matrix[d.argmin()]
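
The expression for `d` relies on the expansion |a - b|^2 = a.a + b.b - 2 a.b, so it yields squared distances rather than distances. As a quick sanity check (a sketch, with `d_ref` being a name introduced here for illustration), it can be compared against the directly computed squared distances:

```python
import numpy as np

matrix = np.array([[1, 1], [2, 2], [5, 5], [6, 6]])
search_vec = np.array([3, 3])

# Squared distances via the expansion |a - b|^2 = a.a + b.b - 2 a.b
M = matrix.dot(search_vec)
d = np.einsum('ij,ij->i', matrix, matrix) + np.inner(search_vec, search_vec) - 2 * M

# Reference: squared distances computed directly from the differences
d_ref = ((matrix - search_vec) ** 2).sum(axis=1)

closest_vec = matrix[d.argmin()]
```

With floating-point input the two results may differ by rounding error, so a tolerance-based comparison such as `np.allclose` would be the safer check there.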
