优先搜索树混乱 [英] Priority Search Tree confusion
问题描述
我找到的唯一合理的幻灯片集是此,如第15页所述,用于构建:
- 将所有点按其x坐标值排序并将它们存储在平衡二叉树(即范围树)的
叶节点中
- 从根开始,每个节点都在其子树中包含最大点尚未存储在树中较浅深度处的y坐标的值
;如果不存在这样的节点,则node
为空
我实现了范围树就在之前,因此根据我在此处使用的示例,现在(对于优先搜索树):
7
- ---- 0 ---------------------
| |
6 6
---- -3 ---- ---- 4 ------
| | | |
4 2 1 3
---- -4--2 --- 3 --- 5
| | / \ | | / \
0(-3,4)(-2,2)(0,7)NIL(4,-4)(5,3)(6,-1)
-5 2
/ \ / \
(-5,6)(-4,0)(2,1)(3,6)
此处,每个内部节点的形式为:
mid_x
max y
我构成的范围查询是(-inf,1)x (-0.5,5),即(x_low,x_high)x(y_low,y_high)。
- 所以,我们从根开始,我检查x(= 0)是否在(-inf,1)和
如果7>(-0.5,5)中。因此,我在两个孩子中都进行了学习,但是为了简化
,让我跟随左边的一个(在所有情况下,从现在开始)。 - 我检查-3是否为x范围,如果6大于或等于查询y范围的
上限(即5)。是的,所以我
继续。 - 下一个级别相同,因此我们进入下一个级别,现在请
专注于此内部节点。它的最大值ya为0,表示子树的最大值y为
,这是不正确的(左子级为
(-5,6)) * 。
在构建/搜索过程中我缺少什么?
换句话说:
检查树的最左边的分支;它具有 max_y
值(引号中的第二个项目符号),7、6、4、0。
但是那个值不是表示内部节点下的子树中存储的最大y值的值吗?如果是这样,对于 0
并指向(-5,6)
(未使用6,因为它在第二级中使用)。
* 我摆出的特定查询可能不会因此受到损坏,但另一个可能会损坏
sub>您最后的逻辑实际上仍然是正确的。当您点击标记为(6,-3)的节点时,应该已经获得(-5,6)值。
现在,我不是数学专业。但是总的想法是这样。在实施的优先搜索树中,您实际上是在两个单独的条件上进行搜索。
对于x,这是范围的简单二进制搜索。
对于y,您实际上是在将其作为优先级树进行搜索(对于y:inf或-inf:y的搜索很好,这取决于您使用的是max还是min。)
请注意,在第15页的底部,它指出树适合半无限范围(一个为无限)。继续阅读下去,您将看到树是如何针对y的半无限范围进行优化的。
简而言之,由于树是使用x作为二进制搜索结构而构建的和y为优先级(使用最大剩余值),最佳搜索为[x1:x2],[y1:inf]。
树中的每个节点基本上都存储2件事。
1. x的中点(或左侧树中的max-x,并根据> x校验决定向左或向右遍历)。
2.子树中y值最高的POINT尚未添加到前一个。
搜索算法本质上是这样的。使用[x1:x2],[y1:inf]
标准从根开始。1.检查分配给节点的POINT的y值。如果y> y1,则转到2,否则停止向下遍历(由于分配给该节点的POINT具有最大的y值,如果检查失败,则其下没有其他节点可以满足[y1:inf]。 b 2.检查该点的x值是否在[x1:x2]的范围内,如果是,则将该点包括在输出中,无论是否包括该点,都继续执行步骤3。节点的中点值(我们将其称为mx)。如果mx在[x1:x2]的范围内,则遍历两条路径;如果mx <[x1:x2],则向左遍历;否则,向右遍历。
编辑非常非常长的文本:
让我们遍历您的示例。注释使用字母标记每个点(点的名称),每个节点现在具有该点的名称格式,在括号中为y值,在其下方为中档 x。 ,带有'*'的表示已将它们分配到树的某处,并且应该
7(E)
------ 0- -------------------
| |
A(6)G(6)
----- -3 --------- 4 --------
| | | |
C(4)D(2)F(1)I(3)
---- -4--2 --- 3 --- 5
| | / \ | | / \
B(0)C *(-3,4)D *(-2,2)E *(0,7)NIL H(4,-4)I *(5,3)J (6,-1)
-5 2
/ \ / \
A *(-5,6)B *(-4,0)F *(2,1) G *(3,6)
让我们运行示例查询[-2:4] [1 :inf](或任何y> = 1)
- 从根开始,我们看到点E。
- 点E(7)的y是否等于1?是。
- [-2:4]中的点E(0)的x是吗?是
- 将E添加到输出中。
- 中点(也是0)是否在范围内?是
- 从具有E的节点开始,需要遍历两侧。
- 首先,让我们向左走,我们看到点A。
- 点A(6)的y是否等于1?是。
- [-2:4]中的点A(-5)的x是吗?否。
- 跳过A,但继续遍历(仅y检查停止遍历)。
- 是A的中点,范围为[ -2:4]?不,它在左侧。
- 由于-3< [-2:4],我们只需要右移。现在我们看到点D。
- 点D(2)的y是否等于1?没有!跳过该点并停止向下移动,因为D下没有其他点我们没有输出(请注意,即使E低于D,当我们初次访问root时也已经输出了E)。
- 我遍历A,因为我们不需要遍历左侧路径,所以继续往上走。
- 现在我又回到了根,需要同时拥有两个遍历。由于我们刚走左,所以我们向右走。在那里,我们看到点G
- 点G的y(6)> = 1吗?是
- [-2:4]中的点G(3)的x是吗?是
- 将G添加到输出中,现在我们有了(E,G)。
- G的中点在范围内吗?是的,我们必须穿越双方。
- 让我们先左走。我们看到F。
- 点F(1)的y是否等于1?是
- [-2:4]中的点F(2)的x是吗?是
- 将F添加到输出中,现在我们有了(E,F,G)
- 是[-2中F的中点: 4]?是的,我们必须横穿双方。
- 让我们再一次左转,我们看到... NIL。下面没有其他要添加的点(因为我们之前已经检查/添加了F,G)。
- 让我们回到F并向右下方移动,我们看到H。
- 点H(-4)的y是否等于1?否。不要添加点并停止遍历。
- 我们回到已经遍历了两条路径的F。
- 我们去返回到G,它仍然需要正确的路径遍历。
- 我们遍历正确的路径,然后看到I。
- 是点y我(3)> = 1?是
- [-2:4]中的点I(5)的x是吗?否。
- 跳过F。
- I的中点是否在[-2,4]范围内?不,它更大。
- 因为它更大,所以我们只需要遍历左侧,所以我们就可以做到这一点。
- 遍历左侧,我们再见...我由于我们已经看到了I,并且它是一个叶节点,因此我们停止遍历并返回到I。
- 我完成了(不需要向右遍历)。返回到G。
- G完成(两边都经过)。返回到E。
- E完成(两边都经过)。由于E是根节点,所以我们现在完成了。
结果是 E,F,G在范围内。 / p>
The only reasonable slide set I found is this, which in page 15 says, for building:
- Sort all points by their x coordinate value and store them in the leaf nodes of a balanced binary tree (i.e., a range tree)
- Starting at the root, each node contains the point in its subtree with the maximum value for its y coordinate that has not been stored
at a shallower depth in the tree; if no such node exists, then node
is empty
I implemented a Range Tree just before, so based on the example I used there, I now have (for a Priority Search Tree):
7
------ 0 ---------------------
| |
6 6
---- -3 ---- ---- 4 ------
| | | |
4 2 1 3
---- -4 - -2 --- 3 --- 5
| | / \ | | / \
0 (-3,4) (-2,2)(0,7) NIL (4,-4) (5,3)(6,-1)
-5 2
/ \ / \
(-5,6) (-4,0) (2,1) (3,6)
Here, every inner node is of the form:
mid_x
max y
and the range query I am posing is (-inf, 1) x (-0.5, 5), i.e. (x_low, x_high) x (y_low, y_high).
- So, let's start from the root, I check if x (=0) is in (-inf, 1) and if 7 > (-0.5, 5). It is, thus I proceed in both children, but for simplicity, let me follow the left one (in all cases from now).
- I check if -3 is the x range and if 6 is more or equal than the upper bound of the y range of the query (i.e. 5). It is, so I continue.
- Same for the next level, thus we go to the next level and now please focus on this inner node. It has as max y a 0, which indicates that the max y of the subtree is 0, which is incorrect (left child is (-5, 6))*.
What am I missing in building/searching process?
In other words:
Check the leftmost branch of the tree; it has as max_y
values (2nd bullet in the quote), 7, 6, 4, 0.
But isn't that value the one that indicated the maximum y value stored in the subtree under the inner node? If so, this does not hold for 0
and point (-5, 6)
(6 is not used, since its used in the 2nd level).
*The particular query I am posing might not be damaged by that, but another one can.
You last logic is actually still correct. The (-5,6) value should've already been picked up when you hit the node you labelled (6,-3). Now, I'm no math major. But the general idea is this. In the Priority Search tree as you implemented, you're actually searching on two separate criteria. For x, it's a simple binary search for the range. For y, you're actually searching for it as a priority tree (good for search of y:inf or -inf:y, depending on your whether you use max or min.)
Note that at the bottom of page 15, it states that the tree is good for a semi-infinite range (one is infinite). Keep reading down, you'll see how the tree is optimized for semi-infinite range for y.
In short, since your tree is constructed with x as the binary search and y as a priority (using max remaining value), the optimal search is [x1:x2],[y1:inf].
Each node in the tree essentially stores 2 things. 1. The mid-point of x (or the max-x in the left tree, and the decision to traverse left or right is based on the >x check). 2. The POINT with the highest y value in the subtree that had not been added to previous one.
The search algorithm essentially goes like this. Starting from the root using a criteria of [x1:x2], [y1:inf] 1. Check the y-value of the POINT assigned to the node. If y > y1, go to 2, otherwise STOP traversing down (since the POINT assigned to the node has the highest y value, if the check failed, there's no other node beneath it that can fulfill [y1:inf]. 2. Check if the x-value of the point is in range of [x1:x2]. If so, include that point in the output. Continue to step 3 regardless if you included that point. 3. Check the node's "mid-point" value (let's call that mx). If mx is in range of [x1:x2], traverse down both path. If mx is < [x1:x2], traverse left. Otherwise traverse right. Go back to step 1 for each path you traverse.
EDIT for very, VERY long text: Let's run through your example. I've made an additional annotation marking each point using letter (the point's "name"). Each node now have the format of name of the point, with it's y-value in the parenthsis, and the "mid-range" x below it. For the leaf nodes, those with an '*' means they are already assigned to somewhere up the tree, and should be treated as NIL if you ever reach them.
7(E)
------ 0 ---------------------
| |
A(6) G(6)
----- -3 ----- ---- 4 --------
| | | |
C(4) D(2) F(1) I(3)
---- -4 - -2 --- 3 --- 5
| | / \ | | / \
B(0) C*(-3,4)D*(-2,2)E*(0,7) NIL H(4,-4) I*(5,3)J(6,-1)
-5 2
/ \ / \
A*(-5,6)B*(-4,0) F*(2,1) G*(3,6)
Let's run an example query of [-2:4] [1:inf](or any y >= 1)
- Starting from root, we see point E.
- Is the y of point E (7) >= 1? Yes.
- Is the x of point E (0) in [-2:4]? Yes
- Add E to output.
- Is the mid-point (also 0) in range? Yes
- From the node with E, need to traverse both side.
- First, let's go left, we see point A.
- Is the y of point A(6) >= 1? Yes.
- Is the x of point A(-5) in [-2:4]? No.
- Skip A, but continue traverse (only the y check stops traversion).
- Is the mid-point at A in range of [-2:4]? No, it's to the left.
- Since -3 < [-2:4], we only need to traverse right. Now we see point D.
- Is the y of point D(2) >= 1? No! Skip the point and stop traversing down, since there's no other point under D that we have NOT outputted (note, even if E is below D, E is already outputted when we visited root at the beginning).
- I traverse up to A, since we don't need to traverse the left path, keep going up.
- Now I'm back at root, which needs to have both traversed. Since we just went left, we're going right. There, we see point G
- Is the y of point G (6) >= 1? Yes
- Is the x of point G (3) in [-2:4]? Yes
- Add G to output, now we have (E,G).
- Is the mid-point at G in range? Yes, we'll have to traverse both sides.
- Let's go left first. We see F.
- Is the y of point F (1) >= 1? Yes
- Is the x of point F (2) in [-2:4]? Yes
- Add F to output, now we have (E,F,G)
- Is the mid-point at F in [-2:4]? Yes, we'll have to traverse both sides.
- Let's go left first again, we see... NIL. There's no more points to be added below (since we already checked/added F,G before).
- Let's go back up to F and travel down the right side, we see H.
- Is the y of point H (-4) >= 1? No. Don't add the point and stop traversing.
- We go back up to F, which already has both path traversed.
- We go back up to G, which still needs its right path traverse.
- We traverse down rigth path, and see I.
- Is the y of point I (3) >= 1? Yes
- Is the x of point I (5) in [-2:4]? No.
- Skip F.
- Is the mid-point at I in range of [-2,4]? No, it's greater.
- Since it's greater, we only need to traverse the left side, so let's do that.
- Traverse down left side we see... I, again. Since we already seen I, and it's a leaf node, we stop traversing and go back up to I.
- I is done (don't need to traverse right). Go back up to G.
- G is done (both sides traversed). GO back up to E.
- E is done (both sides traversed). Since E is root node, we're now done.
The result is "E,F,G" are in range.
这篇关于优先搜索树混乱的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!