Find K nearest neighbors, starting from a distance matrix

Question

I'm looking for a well-optimized function that accepts an n X n distance matrix and returns an n X k matrix with the indices of the k nearest neighbors of the ith datapoint in the ith row.

I find a gazillion different R packages that let you do KNN, but they all seem to include the distance computations along with the sorting algorithm within the same function. In particular, for most routines the main argument is the original data matrix, not a distance matrix. In my case, I'm using a nonstandard distance on mixed variable types, so I need to separate the sorting problem from the distance computations.

This is not exactly a daunting problem -- I obviously could just use the order function inside a loop to get what I want (see my solution below), but this is far from optimal. For example, the sort function with partial = 1:k when k is small (less than 11) goes much faster, but unfortunately returns only sorted values rather than the desired indices.
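For reference, here is a minimal base-R sketch of the two ideas mentioned above. It is not the asker's actual code. The function names `knn_order` and `knn_partial` and the `exclude_self` argument are hypothetical, introduced only for illustration. The first function is the plain `order`-in-a-loop baseline. The second uses `sort(..., partial = k)` to find the k-th smallest distance in each row, then recovers the indices with `which` and orders only those few candidates. Both assume `D` is a numeric n x n matrix and that `k` is at most n - 1 when self-matches are excluded.

```r
# Baseline: fully order each row and keep the first k indices.
knn_order <- function(D, k, exclude_self = TRUE) {
  D <- as.matrix(D)
  if (exclude_self) diag(D) <- Inf   # assumption: a point is not its own neighbor
  n <- nrow(D)
  nn <- matrix(0L, n, k)
  for (i in seq_len(n)) {
    nn[i, ] <- order(D[i, ])[seq_len(k)]
  }
  nn
}

# Partial-sort variant: find the k-th smallest distance without a full sort,
# then order only the candidates at or below that threshold.
knn_partial <- function(D, k, exclude_self = TRUE) {
  D <- as.matrix(D)
  if (exclude_self) diag(D) <- Inf
  n <- nrow(D)
  nn <- matrix(0L, n, k)
  for (i in seq_len(n)) {
    d <- D[i, ]
    kth <- sort(d, partial = k)[k]   # value of the k-th smallest distance
    cand <- which(d <= kth)          # at least k indices (more if there are ties)
    nn[i, ] <- cand[order(d[cand])][seq_len(k)]
  }
  nn
}

# Usage: Euclidean distances stand in for a custom distance matrix.
set.seed(1)
D <- as.matrix(dist(matrix(runif(12), ncol = 2)))  # 6 x 6
nn <- knn_partial(D, k = 3)                        # 6 x 3 matrix of indices
```

Whether the partial-sort variant is actually faster depends on n and k, so it is worth benchmarking both on your own data.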

Recommended answer

Try the FastKNN package on CRAN (although it is not well documented). It offers a k.nearest.neighbors function that accepts an arbitrary distance matrix. The example below computes the matrix you need.

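library(FastKNN)

# arbitrary data
train <- matrix(sample(c("a","b","c"), 12, replace=TRUE), ncol=2) # n x 2
n <- nrow(train)
# random values as a stand-in for your own n x n distance matrix
distMatrix <- matrix(runif(n^2, 0, 1), ncol=n) # n x n

# matrix of neighbours: row i holds the indices of the k nearest neighbours of point i
k <- 3
nn <- matrix(0, n, k) # n x k
for (i in seq_len(n)) {
   nn[i,] <- k.nearest.neighbors(i, distMatrix, k = k)
}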

Note: You can always search the CRAN package list (Ctrl+F for 'knn') to find related functions: https://cran.r-project.org/web/packages/available_packages_by_name.html
