平衡KD-Tree:哪种方法更有效? [英] Balancing KD-Tree: Which approach is more efficient?

查看:141
本文介绍了平衡KD-Tree:哪种方法更有效?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我正在尝试使用KD树平衡一组(百万+)3D点,并且有两种方法。

I'm trying to balance a set of (Million +) 3D points using a KD-tree and I have two ways of doing it.

方式1:


  1. 使用O(n)算法查找arraysize / 2的最大元素给定的轴并将其存储在当前节点中

  1. Use an O(n) algorithm to find the arraysize/2-th largest element along a given axis and store it at the current node

遍历向量中的所有元素,并将每个元素与我刚刚发现并放入的元素进行比较在newArray1中较小的那些,在newArray2中较大的那些

Iterate over all the elements in the vector and for each, compare them to the element I just found and put those smaller in newArray1, and those larger in newArray2

递归

方法2:


  1. 使用quicksort O(nlogn)对所有内容进行排序沿给定轴的数组中的元素,将元素放在arraysize / 2的位置,并将其存储在当前节点中。

  1. Use quicksort O(nlogn) to sort all the elements in the array along a given axis, take the element at position arraysize/2 and store it in the current node.

然后将所有索引0中的元素放入newArray1中的arraysize / 2-1,并将那些元素从arraysize / 2放入newArray2中的arraysize-1

Then put all the elements from index 0 to arraysize/2-1 in newArray1, and those from arraysize/2 to arraysize-1 in newArray2

递归

方法2似乎更优雅但是方法1似乎更快,因为中值搜索和迭代都为O(n),所以我得到O(2n),它只减小为O(n)。但是同时,即使方法2为O(nlogn)排序时间,也可以在固定时间内将数组拆分为2,但这是否可以弥补O(nlogn)的排序时间呢?

Way 2 seems more "elegant" but way 1 seems faster since the median search and the iterating are both O(n) so I get O(2n) which just reduces to O(n). But then at the same time, even though way 2 is O(nlogn) time to sort, splitting up the array into 2 can be done in constant time, but does it make up for the O(nlogn) time for sorting?

我该怎么办?还是有什至更好的方法来实现这一点,而我什至没有看到?

What should I do? Or is there an even better way to do this that I'm not even seeing?

推荐答案

方法3: p>

How about Way 3:


  1. 使用O(n)算法(例如QuickSelect)来确保位置length / 2处的元素是正确的元素,所有元素之前更少,所有之后都比它大(没有对它们进行完全排序!)-无论如何,这可能是您在方法1步骤1中使用的算法...

  1. Use an O(n) algorithm such as QuickSelect to ensure that the element at position length/2 is the correct element, all elements before are less, and all afterwards are larger than it (without sorting them completely!) - this is probably the algorithm you used in your Way 1 step 1 anyway...

递归到每一半(中间元素除外),并在下一轴重复。

Recurse into each half (except middle element) and repeat with next axis.

请注意,实际上不需要制作节点对象。您实际上可以将树排列成较大的阵列。搜索时,从第一个轴的长度/ 2开始。

Note that you actually do not need to make "node" objects. You can actually keep the tree in a large array. When searching, start at length/2 with the first axis.

我已经看到ELKI使用了这个技巧。它使用很少的内存和代码,这使得树非常快。

I've seen this trick being used by ELKI. It uses very little memory and code, which makes the tree quite fast.

这篇关于平衡KD-Tree:哪种方法更有效?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆