Find closest/similar value (vector) inside a matrix
Question
Let's say I have the following numpy matrix (simplified):
import numpy as np

matrix = np.array([[1, 1],
                   [2, 2],
                   [5, 5],
                   [6, 6]])
Now I want to get the vector from the matrix that is closest to a "search" vector:
search_vec = np.array([3, 3])
What I have done so far is the following:
min_dist = None
result_vec = None
for ref_vec in matrix:
    # Euclidean distance; a norm is never negative, so no abs() is needed
    distance = np.linalg.norm(search_vec - ref_vec)
    print(ref_vec, distance)
    if min_dist is None or min_dist > distance:
        min_dist = distance
        result_vec = ref_vec
This works, but is there a native numpy solution that does it more efficiently? My problem is that the bigger the matrix becomes, the slower the whole process gets. Are there other solutions that handle this in a more elegant and efficient way?
Recommended answer
Approach 1
We can use SciPy's Cython-powered kd-tree (cKDTree) for quick nearest-neighbor lookup, which is very efficient in both memory and runtime:
In [276]: from scipy.spatial import cKDTree
In [277]: matrix[cKDTree(matrix).query(search_vec, k=1)[1]]
Out[277]: array([2, 2])
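If several search vectors have to be looked up against the same matrix, the tree only needs to be built once and can then be queried in a batch. The following is a minimal sketch under that assumption; the `queries` array and the variable names are illustrative and not part of the original answer.

```python
import numpy as np
from scipy.spatial import cKDTree

matrix = np.array([[1, 1],
                   [2, 2],
                   [5, 5],
                   [6, 6]])

# Build the tree once; this is the expensive step.
tree = cKDTree(matrix)

# Hypothetical batch of search vectors, one per row.
queries = np.array([[3, 3],
                    [5.4, 5.4],
                    [0, 0]])

# With k=1, query returns one distance and one row index per search vector.
distances, indices = tree.query(queries, k=1)
closest = matrix[indices]
print(closest)
```

Building the tree pays off mainly when the same matrix is queried many times; for a single lookup, the construction cost can outweigh the faster search.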
Approach 2
With SciPy's cdist:
In [286]: from scipy.spatial.distance import cdist
In [287]: matrix[cdist(matrix, np.atleast_2d(search_vec)).argmin()]
Out[287]: array([2, 2])
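The same idea extends to several search vectors at once, since `cdist` computes the full pairwise distance matrix. A small sketch, assuming a hypothetical `search_vecs` array with one search vector per row:

```python
import numpy as np
from scipy.spatial.distance import cdist

matrix = np.array([[1, 1],
                   [2, 2],
                   [5, 5],
                   [6, 6]])

# Hypothetical batch of search vectors, one per row.
search_vecs = np.array([[3, 3],
                        [6, 7]])

# dists[i, j] is the Euclidean distance from search_vecs[i] to matrix[j].
dists = cdist(search_vecs, matrix)

# For each search vector, pick the row of matrix with the smallest distance.
nearest_idx = dists.argmin(axis=1)
closest = matrix[nearest_idx]
print(closest)
```

Note that the distance matrix has one entry per (search vector, matrix row) pair, so memory use grows with the product of the two sizes.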
Approach 3
With scikit-learn's NearestNeighbors:
from sklearn.neighbors import NearestNeighbors
nbrs = NearestNeighbors(n_neighbors=1).fit(matrix)
closest_vec = matrix[nbrs.kneighbors(np.atleast_2d(search_vec))[1][0,0]]
Approach 4
With scikit-learn's KDTree:
from sklearn.neighbors import KDTree
kdt = KDTree(matrix, metric='euclidean')
cv = matrix[kdt.query(np.atleast_2d(search_vec), k=1, return_distance=False)[0,0]]
Approach 5
Based on the eucl_dist package (disclaimer: I am its author) and following its wiki contents, we can leverage matrix multiplication:
M = matrix.dot(search_vec)
d = np.einsum('ij,ij->i',matrix,matrix) + np.inner(search_vec,search_vec) -2*M
closest_vec = matrix[d.argmin()]
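The quantity `d` above holds squared Euclidean distances, obtained from the identity ||a - b||^2 = ||a||^2 + ||b||^2 - 2 a·b. Because the square root is monotonic, taking the argmin of the squared distances selects the same row as taking the argmin of the distances themselves. The sketch below checks this against a direct broadcasting computation; the names `sq_dists` and `direct` are illustrative only.

```python
import numpy as np

matrix = np.array([[1, 1],
                   [2, 2],
                   [5, 5],
                   [6, 6]])
search_vec = np.array([3, 3])

# Expanded form: ||row||^2 + ||search_vec||^2 - 2 * (row . search_vec)
sq_dists = (np.einsum('ij,ij->i', matrix, matrix)
            + search_vec @ search_vec
            - 2 * (matrix @ search_vec))

# Direct form via broadcasting, for comparison.
direct = np.linalg.norm(matrix - search_vec, axis=1)

# Both forms agree up to floating-point error.
print(np.allclose(sq_dists, direct ** 2))

closest_vec = matrix[sq_dists.argmin()]
print(closest_vec)
```

With floating-point data of large magnitude, the expanded form can lose some precision through cancellation, so the direct form may be the safer choice when exact distance values matter.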