Why is the "start small" algorithm for branch displacement not optimal?


Problem description






When an assembler generates a binary encoding it needs to decide whether to make each branch long or short, short being better, if possible. This part of the assembler is called the branch displacement optimization (BDO) algorithm. A typical approach is that the assembler starts by making all the branch encodings short (if their displacements are below some threshold), then iteratively upsizes to long any branch that no longer reaches its target. This, of course, can cause other branches to be converted to long jumps. So, the assembler has to keep passing through the jump list until no more upsizing is required. This quadratic-time approach would seem to be an optimal algorithm to me, but supposedly BDO is NP-complete and this approach is not actually optimal.
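The iterative loop described above can be sketched as follows. This is a minimal toy model, not a real assembler: the 2-byte short / 5-byte long encodings and the ±127-byte short reach are assumptions mirroring x86 `jmp rel8`/`jmp rel32`, and the `start_small` function and its slot/offset representation are hypothetical names invented for illustration.

```python
# Hypothetical model: each "slot" is either a jump (size 0 in `sizes`,
# an entry in `jumps`) or a fixed-size instruction. Assumed encodings:
SHORT, LONG, REACH = 2, 5, 127  # short jmp, long jmp, short reach (bytes)

def start_small(sizes, jumps):
    """sizes: byte size of each non-jump slot (0 for jump slots).
    jumps: dict mapping jump slot index -> target slot index.
    Returns the set of jump slots that had to be lengthened."""
    lengthened = set()
    changed = True
    while changed:
        changed = False
        # Recompute current byte offsets of every slot.
        offsets, pos = [], 0
        for i, s in enumerate(sizes):
            offsets.append(pos)
            pos += (LONG if i in lengthened else SHORT) if i in jumps else s
        # Upsize any still-short jump whose displacement no longer fits.
        for i, t in jumps.items():
            if i in lengthened:
                continue
            end = offsets[i] + SHORT           # displacement is relative to
            if abs(offsets[t] - end) > REACH:  # the byte after the jump
                lengthened.add(i)
                changed = True
    return lengthened
```

The outer loop is the "keep passing through the jump list" step: each pass lengthens at least one jump or terminates, so there are at most `len(jumps)` passes, giving the quadratic bound.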

Randall Hyde provided a counter-example:

                  .386 
                  .model  flat, syscall
00000000          .code
00000000          _HLAMain        proc


00000000  E9 00000016          jmpLbl:         jmp     [near ptr] target
00000005 = 00000005            jmpSize         =       $-jmpLbl
00000005  00000016 [                   byte    32 - jmpSize*2 dup (0)
            00
           ]
0000001B                       target:
0000001B                       _HLAMain        endp
                                                end

By adding the part in brackets, "[near ptr]", and forcing a 5-byte encoding, the binary actually ends up being shorter, because the allocated array shrinks by twice the jump size. Thus, by making the jump encoding shorter, the final code is actually longer.
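The size accounting in Hyde's listing can be checked directly: the filler array is declared as `32 - jmpSize*2` bytes, so each byte added to the jump encoding removes two bytes of array. (The `total_size` helper below is just illustrative arithmetic, not anything from the listing itself.)

```python
def total_size(jmp_size):
    """Total bytes for the jump plus the filler array in Hyde's listing,
    where the array is declared as `byte 32 - jmpSize*2 dup (0)`."""
    array_size = 32 - jmp_size * 2
    return jmp_size + array_size

long_total = total_size(5)   # forced 5-byte "near ptr" encoding
short_total = total_size(2)  # 2-byte short encoding
```

With the forced 5-byte jump the total is 27 bytes (0x1B, matching `target:` in the listing); with a 2-byte short jump it would be 30 bytes.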

This seems like an extremely pathological case to me, and not really relevant, because the branch encodings themselves are still smaller; it is just a bizarre side effect on a non-branch part of the program that causes the binary to become larger. Since the branch encodings themselves are still smaller, I don't really consider that a valid counter-example to the "start small" algorithm.

Can I consider the start-small algorithm an optimal BDO algorithm or does there exist a realistic case in which it does not provide a minimal encoding size for all branches?

Solution

Here's a proof that, in the absence of the anomalous jumps mentioned by harold in the comments, the "start small" algorithm is optimal:

First, let's establish that "start small" always produces a feasible solution -- that is, one that doesn't contain any short encoding of a too-long jump. The algorithm essentially amounts to repeatedly asking the question "Is it feasible yet?" and lengthening some jump encoding if not, so clearly if it terminates, then the solution it produces must be feasible. Since each iteration lengthens some jump, and no jump is ever lengthened more than once, this algorithm must eventually terminate after at most nJumps iterations, so the solution must be feasible.

Now suppose to the contrary that the algorithm can produce a suboptimal solution X. Let Y be some optimal solution. We can represent a solution as the subset of jump instructions that are lengthened. We know that |X \ Y| >= 1 -- that is, that there is at least 1 instruction lengthening in X that is not also in Y -- because otherwise X would be a subset of Y, and since Y is optimal by assumption and X is known to be feasible, it would follow that X = Y, meaning that X would itself be an optimal solution, which would contradict our original assumption about X.

From among the instructions in X \ Y, choose i to be the one that was lengthened first by the "start small" algorithm, and let Z be the subset of Y (and of X) consisting of all instructions already lengthened by the algorithm prior to this time. Since the "start small" algorithm decided to lengthen i's encoding, it must have been the case that by that point in time (i.e., after lengthening all the instructions in Z), i's jump displacement was too big for a short encoding. (Note that while some of the lengthenings in Z may have pushed i's jump displacement past the critical point, this is by no means necessary -- maybe i's displacement was above the threshold from the beginning. All we can know, and all we need to know, is that i's jump displacement was above the threshold by the time Z had finished being processed.) But now look back at the optimal solution Y, and note that none of the other lengthenings in Y -- i.e., in Y \ Z -- are able to reduce i's jump displacement back down, so, since i's displacement is above the threshold but its encoding is not lengthened by Y, Y is not even feasible! An infeasible solution cannot be optimal, so the existence of such a non-lengthened instruction i in Y would contradict the assumption that Y is optimal -- meaning that no such i can exist.
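The argument above can be sanity-checked by brute force on the same toy model used earlier: enumerate every subset of lengthened jumps, keep only the feasible subsets, and confirm the minimum total size. All of the names and the 2/5-byte, ±127-reach encoding parameters here are assumptions for illustration, not part of the proof.

```python
from itertools import combinations

SHORT, LONG, REACH = 2, 5, 127  # assumed short/long jump sizes and reach

def layout(sizes, jumps, lengthened):
    """Byte offsets of each slot, and the total size, for a given
    choice of which jumps are lengthened."""
    offsets, pos = [], 0
    for i, s in enumerate(sizes):
        offsets.append(pos)
        pos += (LONG if i in lengthened else SHORT) if i in jumps else s
    return offsets, pos

def feasible(sizes, jumps, lengthened):
    """True iff every short-encoded jump still reaches its target."""
    offsets, _ = layout(sizes, jumps, lengthened)
    return all(i in lengthened or
               abs(offsets[t] - (offsets[i] + SHORT)) <= REACH
               for i, t in jumps.items())

def brute_force_optimum(sizes, jumps):
    """Smallest total size over all feasible lengthening subsets."""
    for k in range(len(jumps) + 1):
        for subset in combinations(jumps, k):
            if feasible(sizes, jumps, set(subset)):
                # Fewer lengthenings always means a smaller total, so the
                # first feasible subset size k gives the optimum.
                return layout(sizes, jumps, set(subset))[1]
    return None
```

On instances without the anomalous jumps harold mentioned, the total produced by "start small" matches this exhaustive minimum, as the proof predicts.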
