平衡KD-Tree:哪种方法更有效? [英] Balancing KD-Tree: Which approach is more efficient?
问题描述
我正在尝试使用KD树平衡一组(百万+)3D点,并且有两种方法。
I'm trying to balance a set of (Million +) 3D points using a KD-tree and I have two ways of doing it.
方式1:
-
使用O(n)算法查找arraysize / 2的最大元素给定的轴并将其存储在当前节点中
Use an O(n) algorithm to find the arraysize/2-th largest element along a given axis and store it at the current node
遍历向量中的所有元素,并将每个元素与我刚刚发现并放入的元素进行比较在newArray1中较小的那些,在newArray2中较大的那些
Iterate over all the elements in the vector and for each, compare them to the element I just found and put those smaller in newArray1, and those larger in newArray2
递归
方法2:
-
使用quicksort O(nlogn)对所有内容进行排序沿给定轴的数组中的元素,将元素放在arraysize / 2的位置,并将其存储在当前节点中。
Use quicksort O(nlogn) to sort all the elements in the array along a given axis, take the element at position arraysize/2 and store it in the current node.
然后将所有索引0中的元素放入newArray1中的arraysize / 2-1,并将那些元素从arraysize / 2放入newArray2中的arraysize-1
Then put all the elements from index 0 to arraysize/2-1 in newArray1, and those from arraysize/2 to arraysize-1 in newArray2
递归
方法2似乎更优雅但是方法1似乎更快,因为中值搜索和迭代都为O(n),所以我得到O(2n),它只减小为O(n)。但是同时,即使方法2为O(nlogn)排序时间,也可以在固定时间内将数组拆分为2,但这是否可以弥补O(nlogn)的排序时间呢?
Way 2 seems more "elegant" but way 1 seems faster since the median search and the iterating are both O(n) so I get O(2n) which just reduces to O(n). But then at the same time, even though way 2 is O(nlogn) time to sort, splitting up the array into 2 can be done in constant time, but does it make up for the O(nlogn) time for sorting?
我该怎么办?还是有什至更好的方法来实现这一点,而我什至没有看到?
What should I do? Or is there an even better way to do this that I'm not even seeing?
推荐答案
方法3: p>
How about Way 3:
-
使用O(n)算法(例如QuickSelect)来确保位置length / 2处的元素是正确的元素,所有元素之前更少,所有之后都比它大(没有对它们进行完全排序!)-无论如何,这可能是您在方法1步骤1中使用的算法...
Use an O(n) algorithm such as QuickSelect to ensure that the element at position length/2 is the correct element, all elements before are less, and all afterwards are larger than it (without sorting them completely!) - this is probably the algorithm you used in your Way 1 step 1 anyway...
递归到每一半(中间元素除外),并在下一轴重复。
Recurse into each half (except middle element) and repeat with next axis.
请注意,实际上不需要制作节点对象。您实际上可以将树排列成较大的阵列。搜索时,从第一个轴的长度/ 2开始。
Note that you actually do not need to make "node" objects. You can actually keep the tree in a large array. When searching, start at length/2 with the first axis.
我已经看到ELKI使用了这个技巧。它使用很少的内存和代码,这使得树非常快。
I've seen this trick being used by ELKI. It uses very little memory and code, which makes the tree quite fast.
这篇关于平衡KD-Tree:哪种方法更有效?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!