有效地找到50k 2D坐标的n个最近邻居? [英] Efficiently find the n nearest neighbours of 50k 2D coordinates?

查看:83
本文介绍了有效地找到50k 2D坐标的n个最近邻居?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我有一个包含经度和纬度的数组.该任务是查找所有坐标的5个最接近的坐标,而不必每次都遍历所有坐标.

I have an array with latitudes and longitudes. The task is to find the 5 nearest coordinates for all coordinates without looping through all coordinates every time.

推荐答案

有两种解决方案,具体取决于您的数据(您对此一无所知)以及确定的意愿.

There are a couple of solutions depending on your data (which you have told nothing about) and how sure you want to be.

  • 如果您的数据是均匀分布的,则可以在数据之上创建一个网格,并为该网格分配点.之后,您将为每个元素找出它属于哪个网格,并比较该网格(以及最接近的网格)中的距离.通过良好的网格选择并假设网格中平均有k个元素,这可以为您提供潜在的O(n * k^2)运行时间.请查看此答案以获取更多说明.
  • 对数据一无所知,您可以在其中构建 2-d树 O(n log n)时间,然后对数据库中的每个点询问最接近的点(您在O(logn)中要求总共n个点).因此,总复杂度为O(n log n )
  • 另一种方法是使用称为本地敏感度哈希的概率方法. Wiki页面太复杂了,甚至不知道这是什么,我很难阅读该页面.看看此描述可以更好地理解它.
  • @Gene建议使用 quadtree 的另一种方法(尚未听说过这种方法树,所以将其留空)
  • if your data is uniformly distributed, then you can create a grid on top of your data and assign points to the grid. After that for every element you find out which grid it belongs to and compare distances in that grid (and in the closest grids). With good grid selection and assuming there are k elements in the grid on average this can give you a potential O(n * k^2) running time. Look at this answer for more explanation.
  • knowing nothing about the data, you can construct a 2-d tree in O(n log n) time and then for every point in your database to ask what is the closest point to it (you ask this in O(logn) for a total n points). So the total complexity is O(n log n )
  • another approach is to use probabilistic approach called local sensitivity hashing. The wiki page is too convoluted and even knowing what is this, it was hard for me to read that page. Take a look at this description to better understand it.
  • @Gene suggested another approach by using a quadtree (have not heard about this kind of tree, so will just leave it empty)

因此,如您所见,此任务的复杂性可能会优于O(n^2).

So as you see it is possible to get better than O(n^2) complexity for this task.

所有方法都描述了如何找到与您要搜索的点最接近的点.很明显,找到最接近的点后,可以将其删除并找到另一个最接近的点,依此类推,直到找到5个最接近的点.

All the methods are describing how to find the closest point to a point you are searching for. It is obvious that after you found the closest point, you can remove it and find another closest point and so on till you found 5 closest point for your point.

这篇关于有效地找到50k 2D坐标的n个最近邻居?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆