Nearest neighbour search with KDTree


Problem description


For a list of N points [(x_1,y_1), (x_2,y_2), ... ] I am trying to find the nearest neighbour of each point, based on distance. My dataset is too large for a brute-force approach, so a KDTree seems best.


Rather than implement one from scratch, I see that sklearn.neighbors.KDTree can find nearest neighbours. Can it be used to find the nearest neighbour of each particle, i.e. return a list of length N?

Recommended answer


This question is quite broad and is missing details. It's unclear what you tried, what your data looks like, and what counts as a nearest neighbour (does the identity, i.e. the point itself at distance 0, count?).


Assuming you are not interested in the identity (the point itself, at distance 0), you can query for the two nearest neighbours and drop the first column. This is probably the easiest approach here.

 import numpy as np
 from sklearn.neighbors import KDTree
 np.random.seed(0)
 X = np.random.random((5, 2))  # 5 points in 2 dimensions
 tree = KDTree(X)
 nearest_dist, nearest_ind = tree.query(X, k=2)  # k=2: the first neighbor is each point itself (identity)
 print(X)
 print(nearest_dist[:, 1])    # drop id; assumes sorted -> see args!
 print(nearest_ind[:, 1])     # drop id 

Output

 [[ 0.5488135   0.71518937]
  [ 0.60276338  0.54488318]
  [ 0.4236548   0.64589411]
  [ 0.43758721  0.891773  ]
  [ 0.96366276  0.38344152]]
 [ 0.14306129  0.1786471   0.14306129  0.20869372  0.39536284]
 [2 0 0 0 1]
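For small N like this, the KDTree result can be sanity-checked against a brute-force pairwise-distance computation in plain NumPy. This is a sketch for verification only; the O(N²) memory and time cost is exactly why a tree is needed for large datasets:

```python
import numpy as np

np.random.seed(0)
X = np.random.random((5, 2))  # same 5 points as above

# Pairwise Euclidean distances via broadcasting: shape (5, 5)
D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
np.fill_diagonal(D, np.inf)  # exclude each point itself (the identity)

nearest_ind = D.argmin(axis=1)   # index of each point's nearest neighbour
nearest_dist = D.min(axis=1)     # distance to that neighbour
print(nearest_ind)   # matches the KDTree indices above: [2 0 0 0 1]
print(nearest_dist)
```

Agreement between the two methods confirms the `k=2` / drop-first-column trick returns the true non-identity nearest neighbours.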

