寻找一组中距离另一组最远的点 [英] Finding the farthest point in one set from another set

查看:34
本文介绍了寻找一组中距离另一组最远的点的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我的目标是更有效地实现算法 在这个问题中.

My goal is a more efficient implementation of the algorithm posed in this question.

考虑两组点(在 N 空间中.对于 RGB 颜色空间的示例情况,3-空间,而 1-space 2-空间的解决方案仅在距离计算上有所不同).如何找到第一组中离第二组最近邻最远的点?

Consider two sets of points (in N-space. 3-space for the example case of RGB colorspace, while a solution for 1-space 2-space differs only in the distance calculation). How do you find the point in the first set that is the farthest from its nearest neighbor in the second set?

在一个 1 空间的例子中,给定集合 A:{2,4,6,8} 和 B:{1,3,5},答案是8,因为 8 与 5(B 中的最近邻居)相距 3 个单位,而 A 的所有其他成员与 B 中的最近邻居仅相距 1 个单位.1-空间过于简化,因为排序与距离有关在某种程度上,它不是在更高的维度上.

In a 1-space example, given the sets A:{2,4,6,8} and B:{1,3,5}, the answer would be 8, as 8 is 3 units away from 5 (its nearest neighbor in B) while all other members of A are just 1 unit away from their nearest neighbor in B. edit: 1-space is overly simplified, as sorting is related to distance in a way that it is not in higher dimensions.

源问题中的解决方案涉及对一组中的每个点进行蛮力比较(所有 R、G、B,其中 512>=R+G+B>=256 和 R%4=0 和 G%4=0 和 B%4=0) 到另一组 (colorTable) 中的每个点.为了这个问题,忽略第一个集合是以编程方式详细说明的,而不是像第二个集合那样作为存储列表迭代.

The solution in the source question involves a brute force comparison of every point in one set (all R,G,B where 512>=R+G+B>=256 and R%4=0 and G%4=0 and B%4=0) to every point in the other set (colorTable). Ignore, for the sake of this question, that the first set is elaborated programmatically instead of iterated over as a stored list like the second set.

推荐答案

首先你需要在另一个集合中找到每个元素的最近邻.

First you need to find every element's nearest neighbor in the other set.

要有效地做到这一点,您需要一个最近邻算法.就我个人而言,我会实现一个 kd-tree 只是因为我过去在我的算法中做过它类,这是相当简单的.另一个可行的替代方案是 R-tree.

To do this efficiently you need a nearest neighbor algorithm. Personally I would implement a kd-tree just because I've done it in the past in my algorithm class and it was fairly straightforward. Another viable alternative is an R-tree.

对最小集合中的每个元素执行一次.(从最小到大添加一个元素并运行该算法以找到其最近的邻居.)

Do this once for each element in the smallest set. (Add one element from the smallest to larger one and run the algorithm to find its nearest neighbor.)

由此您应该能够获得每个元素的最近邻居列表.

From this you should be able to get a list of nearest neighbors for each element.

在查找最近邻对时,将它们保存在一个排序的数据结构中,该结构具有快速加法和快速 getMax 方法,例如 ,按欧几里得距离排序.

While finding the pairs of nearest neighbors, keep them in a sorted data structure which has a fast addition method and a fast getMax method, such as a heap, sorted by Euclidean distance.

然后,一旦完成,只需询问堆的最大值.

Then, once you're done simply ask the heap for the max.

此操作的运行时间细分如下:

The run time for this breaks down as follows:

N = 较小集合的大小
M = 较大集合的大小

N = size of smaller set
M = size of the larger set

  • N * O(log M + 1) 用于所有 kd 树最近邻检查.
  • N * O(1) 用于在将其添加到堆之前计算欧几里得距离.
  • N * O(log N) 用于将这些对添加到堆中.
  • O(1) 得到最终答案 :D

所以最终整个算法是O(N*log M).

So in the end the whole algorithm is O(N*log M).

如果你不关心每对的顺序,你可以通过只保留目前找到的最大值来节省一些时间和空间.

If you don't care about the order of each pair you can save a bit of time and space by only keeping the max found so far.

*免责声明:这一切都假设您不会使用大量的维度,并且您的元素遵循大部分随机分布.

*Disclaimer: This all assumes you won't be using an enormously high number of dimensions and that your elements follow a mostly random distribution.

这篇关于寻找一组中距离另一组最远的点的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆