Jump in the middle of a basic block


Question

A basic block is defined as a sequence of non-jump instructions ending with a jump instruction (direct or indirect). The jump target address should be the start of another basic block. Suppose I have the following assembly code:

106ac:       ba00000f        blt     106f0 <main+0xb8>
106b0:       e3099410        movw    r9, #37904      ; 0x9410
106b4:       e3409001        movt    r9, #1
106b8:       e79f9009        ldr     r9, [pc, r9]
106bc:       e3a06000        mov     r6, #0
106c0:       e1a0a008        mov     sl, r8
106c4:       e30993fc        movw    r9, #37884      ; 0x93fc
106c8:       e3409001        movt    r9, #1
106cc:       e79f9009        ldr     r9, [pc, r9]
106d0:       e5894000        str     r4, [r9]
106d4:       e7941105        ldr     r1, [r4, r5, lsl #2]
106d8:       e1a00007        mov     r0, r7
106dc:       e12fff31        blx     r1
106e0:       e0806006        add     r6, r0, r6
106e4:       e25aa001        subs    sl, sl, #1
106e8:       e287700d        add     r7, r7, #13
106ec:       1afffff4        bne     106c4 <main+0x8c>
106f0:       e30993d0        movw    r9, #37840      ; 0x93d0
106f4:       e3409001        movt    r9, #1

bb1

106a4:       ...
106ac:       ba00000f        blt     106f0 <main+0xb8>

The first basic block, bb1, has a target address that is the start of bb4.

bb2

106b0:       e3099410        movw    r9, #37904      ; 0x9410
....        All other instructions
106c4:       e30993fc        movw    r9, #37884      ; 0x93fc
....        All other instructions
106d8:       e1a00007        mov     r0, r7
106dc:       e12fff31        blx     r1

The second basic block, bb2, ends with an indirect branch, so the target address is known only at runtime.

bb3

106e0:       e0806006        add     r6, r0, r6
106e4:       e25aa001        subs    sl, sl, #1
106e8:       e287700d        add     r7, r7, #13
106ec:       1afffff4        bne     106c4 <main+0x8c>

The third basic block has a target address (106c4) that is not the start of another basic block but lies in the middle of bb2. According to the definition of a basic block, this should not be possible. In practice, however, I see this behavior (jumps into the middle of basic blocks) in multiple places. How can it be explained? Is it possible to force a compiler (LLVM) to generate code that only ever jumps to the beginning of a basic block?

bb4

106f0:       e30993d0        movw    r9, #37840      ; 0x93d0
106f4:       e3409001        movt    r9, #1
....
Ends with a branch (direct or indirect)

I am generating the basic blocks with a tool (SPEDI). The compiler is LLVM (Clang front end) and the target architecture is ARMv7 (Cortex-A9).

Recommended answer

Basic blocks are the nodes of the control flow graph (CFG): once control enters a block, it can do nothing but run through the whole block and exit it. That does not mean a block has to start or end with a jump instruction. For a better understanding, consider this excerpt from Wikipedia:

Because of its construction procedure, in a CFG, every edge A→B has the property that:

outdegree(A) > 1 or indegree(B) > 1 (or both).

The CFG can thus be obtained, at least conceptually, by starting from the program's (full) flow graph, i.e. the graph in which every node represents an individual instruction, and performing an edge contraction for every edge that falsifies the predicate above, i.e. contracting every edge whose source has a single exit and whose destination has a single entry.
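The contraction described in the excerpt can be sketched in Python. This is a minimal illustration rather than how any particular tool works: the function name `contract_to_basic_blocks` is hypothetical, and the edge list is an assumption derived by hand from the listing in the question, treating the call at 106dc as simply falling through to 106e0.

```python
def contract_to_basic_blocks(nodes, edges):
    """Group instructions into blocks by contracting every edge whose
    source has a single exit and whose destination has a single entry.

    nodes: instruction addresses in program order.
    edges: list of distinct (src, dst) control-flow edges.
    """
    out_deg = {n: 0 for n in nodes}
    in_deg = {n: 0 for n in nodes}
    for src, dst in edges:
        out_deg[src] += 1
        in_deg[dst] += 1

    # Union-find: each instruction starts as its own group.
    parent = {n: n for n in nodes}

    def find(n):
        while parent[n] != n:
            parent[n] = parent[parent[n]]  # path halving
            n = parent[n]
        return n

    for src, dst in edges:
        # This edge falsifies "outdegree(A) > 1 or indegree(B) > 1",
        # so its two endpoints belong to the same block.
        if out_deg[src] == 1 and in_deg[dst] == 1:
            parent[find(dst)] = find(src)

    groups = {}
    for n in nodes:
        groups.setdefault(find(n), []).append(n)
    return sorted(groups.values())


# Hand-built instruction-level flow graph for 106ac..106f0 (assumed).
nodes = list(range(0x106AC, 0x106F4, 4))
edges = [(a, a + 4) for a in nodes[:-1]]   # fall-through edges
edges.append((0x106AC, 0x106F0))           # blt 106f0
edges.append((0x106EC, 0x106C4))           # bne 106c4

blocks = contract_to_basic_blocks(nodes, edges)
for block in blocks:
    print(hex(block[0]), "..", hex(block[-1]))
```

Under these assumptions the instructions fall into four groups: 106ac alone, 106b0 to 106c0, 106c4 to 106ec, and 106f0. The edge from 106c0 to 106c4 is not contracted because 106c4 has two incoming edges, which is what makes it the start of a block.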

By this definition, I would analyze the code between 106b0 and 106ec differently: one block B1 from 106b0 to 106c0, and one block B2 from 106c4 to 106ec. This code is a loop: B1 is the loop setup and B2 is the loop body.

On ARM, a bl or blx instruction such as the blx r1 at 106dc is a function call: control passes to the called function and then returns to the instruction right after the call. So if we are constructing the CFG for the calling function only, I would not consider this instruction a block boundary. If we are building the CFG for the whole program, there should be an edge to the called function here and another edge back from the called function to the next instruction.
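The effect of that choice can be illustrated with the classic "leader" approach to finding block starts. The sketch below is only an assumption-laden illustration: `find_leaders` and its `calls_end_blocks` flag are hypothetical names, the instruction kinds are classified by hand from the listing in the question, and nothing here is meant to describe how SPEDI or LLVM actually work.

```python
def find_leaders(instrs, calls_end_blocks=False):
    """Return the sorted addresses that start a basic block.

    instrs: list of (address, kind, target) in program order, where kind
    is "branch", "call" or "other" and target is a known branch target
    or None.
    """
    addresses = {addr for addr, _, _ in instrs}
    leaders = {instrs[0][0]}  # the first instruction always starts a block
    for i, (addr, kind, target) in enumerate(instrs):
        # A known branch target inside the listing starts a block.
        if kind == "branch" and target in addresses:
            leaders.add(target)
        # The instruction after a block-ending instruction starts a block.
        ends_block = kind == "branch" or (kind == "call" and calls_end_blocks)
        if ends_block and i + 1 < len(instrs):
            leaders.add(instrs[i + 1][0])
    return sorted(leaders)


# Hand classification of the listing 106ac..106f4 (assumed).
BRANCHES = {0x106AC: 0x106F0, 0x106EC: 0x106C4}  # blt, bne
CALLS = {0x106DC}                                # blx r1

instrs = []
for addr in range(0x106AC, 0x106F8, 4):
    if addr in BRANCHES:
        instrs.append((addr, "branch", BRANCHES[addr]))
    elif addr in CALLS:
        instrs.append((addr, "call", None))
    else:
        instrs.append((addr, "other", None))

intra = find_leaders(instrs)                         # calls fall through
whole = find_leaders(instrs, calls_end_blocks=True)  # calls end a block
print([hex(a) for a in intra])
print([hex(a) for a in whole])
```

With calls falling through, the block starts are 106ac, 106b0, 106c4 and 106f0. Treating the call as a block end adds 106e0. In both cases 106c4 is a block start, because it is the target of the bne at 106ec.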
