AArch64 relocation prefixes


Question

I noticed a GNU asm relocation syntax for ARM 64-bit assembly. What are those pieces like #:abs_g0_nc: and :pg_hi21:? Where are they explained? Is there a pattern to them or are they made up on the go? Where can I learn more?

Recommended answer

Introduction

ELF64 defines two types of relocation entries, called REL and RELA:

typedef struct
{
    Elf64_Addr r_offset;    /* Address of reference */
    Elf64_Xword r_info;     /* Symbol index and type of relocation */
} Elf64_Rel;

typedef struct
{
    Elf64_Addr r_offset;    /* Address of reference */
    Elf64_Xword r_info;     /* Symbol index and type of relocation */
    Elf64_Sxword r_addend;  /* Constant part of expression */
} Elf64_Rela;

The purpose of each relocation entry is to give the loader (static or dynamic) four pieces of information:

  • The virtual address or the offset of the instruction to patch.
    This is given by r_offset.

  • The runtime address of the symbol accessed.
    This is derived from the symbol index stored in the upper part of r_info.

  • A custom value called the addend.
    This value is eventually used as an operand in the expression that computes the value written to patch the instruction.
    RELA entries carry this value in r_addend; REL entries extract it from the relocation site.

  • The relocation type.
    This determines the expression used to compute the value that patches the instruction. It is encoded in the lower part of r_info.

During the relocation phase the loader goes through all the relocation entries and writes to the location specified by each r_offset, using a formula chosen by the lower part of r_info to compute the value to store from the addend (r_addend for RELA) and the symbol address (obtainable from the upper part of r_info).
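As a rough illustration of how the two halves of r_info are separated, here is a minimal sketch in C. It assumes the usual ELF64 convention of a 32-bit symbol index in the upper half and a 32-bit type in the lower half; the names rela_entry, rela_sym, rela_type and rela_info are hypothetical and not taken from any header.

```c
#include <stdint.h>

/* Same layout as the Elf64_Rela shown above, with fixed-width types.
   The struct and function names here are made up for illustration. */
typedef struct {
    uint64_t r_offset;  /* address of reference */
    uint64_t r_info;    /* symbol index (upper half) and type (lower half) */
    int64_t  r_addend;  /* constant part of expression */
} rela_entry;

/* Upper 32 bits of r_info: index of the symbol in the symbol table. */
static uint32_t rela_sym(const rela_entry *r)
{
    return (uint32_t)(r->r_info >> 32);
}

/* Lower 32 bits of r_info: the relocation type. */
static uint32_t rela_type(const rela_entry *r)
{
    return (uint32_t)(r->r_info & 0xffffffffu);
}

/* Inverse operation: pack a symbol index and a type into r_info. */
static uint64_t rela_info(uint32_t sym, uint32_t type)
{
    return ((uint64_t)sym << 32) | type;
}
```

A loader would use rela_sym to look up S in the symbol table and rela_type to pick the formula that combines S, r_addend and r_offset.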

Actually, the write step has been simplified. Unlike other architectures, where the immediate field of an instruction usually occupies bytes entirely separate from those that encode the operation, in ARM the immediate value is mixed with other encoding information.
So the loader would need to know what kind of instruction it is trying to relocate, if it is an instruction at all [1]; but instead of having the loader disassemble the relocation site, the assembler sets the relocation type according to the instruction.

Each relocation type can relocate only one or two encoding-equivalent instructions.
In specific cases the relocation itself even changes the type of instruction.

The value computed during the relocation is implicitly extended to 64 bits, signed or unsigned depending on the relocation type chosen.

Since ARM is a RISC architecture with a fixed instruction size, loading a full-width (i.e. 64-bit) immediate into a register is non-trivial, as no instruction can have a full-width immediate field.

Relocation in AArch64 has to address this issue too. It is actually a two-fold problem: first, find the real value that the programmer intended to use (this is the pure relocation part of the problem); second, find a way to put it into a register, since no instruction has a 64-bit immediate field.

The second issue is addressed by using group relocations: each relocation type in a group computes a 16-bit part of the 64-bit value, so there can be only four relocation types in a group (ranging from G0 to G3).

This slicing into 16 bits matches the movk (move keeping), movz (move zeroing) and movn (move with logical negation) instructions.
Other instructions, like b, bl, adrp, adr and so on, have relocation types specially suited to them.
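The 16-bit slicing can be sketched in a few lines of C. This is only an illustration of the arithmetic, not linker code; group_chunk and rebuild_from_chunks are hypothetical helper names.

```c
#include <stdint.h>

/* The 16-bit slice that group Gg contributes: bits [16*g+15 : 16*g] of x.
   g is expected to be 0..3. */
static uint16_t group_chunk(uint64_t x, unsigned g)
{
    return (uint16_t)(x >> (16u * (g & 3u)));
}

/* Models one movz (G3) followed by three movk (G2, G1, G0). */
static uint64_t rebuild_from_chunks(const uint16_t c[4])
{
    /* movz: write one slice and zero the rest of the register */
    uint64_t reg = (uint64_t)c[3] << 48;

    /* movk: overwrite one slice and keep all the other bits */
    for (int g = 2; g >= 0; g--) {
        reg &= ~((uint64_t)0xffff << (16 * g));
        reg |= (uint64_t)c[g] << (16 * g);
    }
    return reg;
}
```

Feeding the four slices of a value back through rebuild_from_chunks yields the original 64-bit value, which is what a movz/movk sequence achieves in a register.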

Whenever there is only one, and thus unambiguous, possible relocation type for a given instruction that references a symbol, the assembler can generate the corresponding entry without the programmer having to specify it explicitly.

Group relocations don't fit into this category: they exist to give the programmer some flexibility, so they are generally stated explicitly. Within a group, a relocation type can specify whether the assembler must perform an overflow check.
A G0 relocation, used to load the lower 16 bits of a value, checks (unless explicitly suppressed) that the value fits in 16 bits (signed or unsigned, depending on the specific type used). The same is true for G1, which loads bits 31-16 and checks that the value fits in 32 bits.
As a consequence, G3 is always non-checking, as every value fits in 64 bits.
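The overflow check for the unsigned group relocations can be sketched as follows. This is a simplified model of the rule described above (the value must fit in 16, 32 or 48 bits for G0, G1 and G2), and uabs_fits is a hypothetical name.

```c
#include <stdbool.h>
#include <stdint.h>

/* Returns true if x passes the check of an unsigned checking relocation
   of group Gg: x must fit in 16*(g+1) bits. G3 covers the full 64 bits,
   so it can never overflow. */
static bool uabs_fits(uint64_t x, unsigned g)
{
    if (g >= 3)
        return true;
    /* Any bit set above the group's top bit means overflow. */
    return (x >> (16u * (g + 1u))) == 0;
}
```

The _nc (no check) variants simply skip this test and take the selected slice whatever the value is.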

Finally, relocations can be used to load integer values into registers. In fact, the address of a symbol is nothing more than an arbitrary integer constant.
Note that r_addend is 64 bits wide.

[1] If r_offset points to a site in a data section, the computed value is written as a 64-bit word at the location indicated.

First, some references:

  • The ARM document that describes the relocation types for the ELF64 format is here, section 4.6.

  • A test AArch64 assembly file that, presumably, contains all the relocation operators available to GAS is here.

Following the ARM document convention we have:

S is the runtime address of the symbol being relocated.
A is the addend for the relocation.
P is the address of the relocation site (derived from r_offset).
X is the result of a relocation operation, before any masking or bit-selection operation is applied.
Page(expr) is the page address of the expression expr, defined as expr & ~0xFFF, i.e. expr with the lower 12 bits cleared.
GOT is the address of the Global Offset Table.
GDAT(S+A) represents a 64-bit entry in the GOT for address S+A. The entry will be relocated at run time with relocation R_AARCH64_GLOB_DAT(S+A).
G(expr) is the address of the GOT entry for the expression expr.
Delta(S) resolves to the difference between the static link address of S and the execution address of S. If S is the null symbol (ELF symbol index 0), it resolves to the difference between the static link address of P and the execution address of P.
Indirect(expr) represents the result of calling expr as a function.
[msb:lsb] is a bit-mask operation representing the selection of bits in a value; bounds are inclusive.
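Two of these notations translate directly into C. The sketch below uses hypothetical helper names (page and bits) and only mirrors the definitions given above.

```c
#include <stdint.h>

/* Page(expr): expr with the lower 12 bits cleared. */
static uint64_t page(uint64_t expr)
{
    return expr & ~(uint64_t)0xFFF;
}

/* X[msb:lsb]: select bits msb..lsb of x, both bounds inclusive.
   Expects 63 >= msb >= lsb. */
static uint64_t bits(uint64_t x, unsigned msb, unsigned lsb)
{
    unsigned width = msb - lsb + 1;
    /* A 64-bit shift by 64 is undefined in C, so handle it separately. */
    uint64_t mask = (width >= 64) ? ~(uint64_t)0
                                  : (((uint64_t)1 << width) - 1);
    return (x >> lsb) & mask;
}
```

For example, the Immediate column entry X[31:16] in the tables below corresponds to bits(X, 31, 16).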

Operators

For compactness, the relocation names omit the R_AARCH64_ prefix.

Expressions of the kind |X|≤2^16 are intended as -2^16 ≤ X < 2^16; note the strict inequality on the right.
This is an abuse of notation, caused by the constraints of formatting a table.

Group relocations

Operator    | Relocation name | Operation | Inst | Immediate | Check
------------+-----------------+-----------+------+-----------+----------
:abs_g0:    | MOVW_UABS_G0    | S + A     | movz | X[15:0]   | 0≤X<2^16
------------+-----------------+-----------+------+-----------+----------
:abs_g0_nc: | MOVW_UABS_G0_NC | S + A     | movk | X[15:0]   | 
------------+-----------------+-----------+------+-----------+----------
:abs_g1:    | MOVW_UABS_G1    | S + A     | movz | X[31:16]  | 0≤X<2^32
------------+-----------------+-----------+------+-----------+----------
:abs_g1_nc: | MOVW_UABS_G1_NC | S + A     | movk | X[31:16]  | 
------------+-----------------+-----------+------+-----------+----------
:abs_g2:    | MOVW_UABS_G2    | S + A     | movz | X[47:32]  | 0≤X<2^48
------------+-----------------+-----------+------+-----------+----------
:abs_g2_nc: | MOVW_UABS_G2_NC | S + A     | movk | X[47:32]  | 
------------+-----------------+-----------+------+-----------+----------
:abs_g3:    | MOVW_UABS_G3    | S + A     | movk | X[63:48]  | 
            |                 |           | movz |           |
------------+-----------------+-----------+------+-----------+----------
:abs_g0_s:  | MOVW_SABS_G0    | S + A     | movz | X[15:0]   | |X|≤2^16
            |                 |           | movn |           |
------------+-----------------+-----------+------+-----------+----------
:abs_g1_s:  | MOVW_SABS_G1    | S + A     | movz | X[31:16]  | |X|≤2^32
            |                 |           | movn |           |
------------+-----------------+-----------+------+-----------+----------
:abs_g2_s:  | MOVW_SABS_G2    | S + A     | movz | X[47:32]  | |X|≤2^48
            |                 |           | movn |           |
------------+-----------------+-----------+------+-----------+----------

The table shows the ABS version; the assembler can pick the PREL (PC-relative) or the GOTOFF (GOT-relative) version depending on the symbol referenced and the type of output format.

A typical use of these relocation operators is:

Unsigned 64 bits                      Signed 48 bits
movz    x1,#:abs_g3:u64               movz  x1,#:abs_g2_s:s48
movk    x1,#:abs_g2_nc:u64            movk  x1,#:abs_g1_nc:s48
movk    x1,#:abs_g1_nc:u64            movk  x1,#:abs_g0_nc:s48
movk    x1,#:abs_g0_nc:u64

Usually only one checking operator is used, the one that sets the highest part.
That's why the checking versions relocate movz only, while the non-checking versions relocate movk (which sets a register partially).
G3 relocates both because it is intrinsically non-checking, as no value can exceed 64 bits.

The signed versions end with _s and they are always checking.
There is no G3 version because if a 64-bit value is used the sign is fully specified in the value itself.
They are always used only to set the highest part, as the sign is relevant only there.
They are always checking because an overflow in a signed value makes the value meaningless.
These relocations change the type of the instruction to movn or movz based on the sign of the value; this effectively sign-extends the value.
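The movn/movz switch can be modelled in C as below. This is a sketch based on my reading of the ARM ELF document: for X >= 0 the instruction becomes movz with the selected bits of X, and for X < 0 it becomes movn with the selected bits of ~X. The names sabs_patch, sabs_select and sabs_result are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* What a signed group relocation writes into the instruction. */
typedef struct {
    bool     use_movn;  /* true: instruction becomes movn, false: movz */
    uint16_t imm16;     /* 16-bit immediate field */
} sabs_patch;

/* Choose the instruction and immediate for value x and group Gg (g = 0..2). */
static sabs_patch sabs_select(int64_t x, unsigned g)
{
    sabs_patch p;
    uint64_t raw = (uint64_t)x;

    p.use_movn = (x < 0);
    if (p.use_movn)
        raw = ~raw;             /* movn inverts again at run time */
    p.imm16 = (uint16_t)(raw >> (16u * g));
    return p;
}

/* Register contents after executing the patched instruction alone. */
static int64_t sabs_result(sabs_patch p, unsigned g)
{
    uint64_t v = (uint64_t)p.imm16 << (16u * g);
    /* Conversion back to a signed type assumes two's complement. */
    return (int64_t)(p.use_movn ? ~v : v);
}
```

Because movn sets every bit outside the immediate to 1, a negative value ends up with all its upper bits set, which is the sign extension mentioned above. The lower slices still have to be filled in by the following movk instructions.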

Group relocations are also available.

PC-relative, 19-, 21- and 33-bit addresses

Operator    | Relocation name | Operation | Inst | Immediate | Check
------------+-----------------+-----------+------+-----------+----------
[implicit]  | LD_PREL_LO19    | S + A - P | ldr  | X[20:2]   | |X|≤2^20
------------+-----------------+-----------+------+-----------+----------
[implicit]  | ADR_PREL_LO21   | S + A - P | adr  | X[20:0]   | |X|≤2^20
------------+-----------------+-----------+------+-----------+----------
:pg_hi21:   | ADR_PREL_PG     | Page(S+A) | adrp | X[32:12]  | |X|≤2^32
            | _HI21           | - Page(P) |      |           |
------------+-----------------+-----------+------+-----------+----------
:pg_hi21_nc:| ADR_PREL_PG     | Page(S+A) | adrp | X[32:12]  | 
            | _HI21_NC        | - Page(P) |      |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | ADD_ABS_LO12_NC | S + A     | add  | X[11:0]   | 
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST8_ABS_LO12  | S + A     | ld   | X[11:0]   | 
            | _NC             |           | st   |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST16_ABS_LO12 | S + A     | ld   | X[11:1]   | 
            | _NC             |           | st   |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST32_ABS_LO12 | S + A     | ld   | X[11:2]   | 
            | _NC             |           | st   |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST64_ABS_LO12 | S + A     | prfm | X[11:3]   | 
            | _NC             |           |      |           |
------------+-----------------+-----------+------+-----------+----------
:lo12:      | LDST128_ABS     | S + A     | ?    | X[11:4]   | 
            | _LO12_NC        |           |      |           |

The meaning of :lo12: changes depending on the size of the data the instruction is handling (e.g. ldrb uses LDST8_ABS_LO12_NC, ldrh uses LDST16_ABS_LO12_NC).
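To see how the page-relative high part and the :lo12: low part fit together, here is a small C model of an adrp followed by an add or a load. It assumes the adrp immediate is the 21-bit field X[32:12] of Page(S+A) - Page(P) and that the low field is X[11:n] for an access of 2^n bytes; all function names are hypothetical.

```c
#include <stdint.h>

static uint64_t page_of(uint64_t addr)
{
    return addr & ~(uint64_t)0xFFF;
}

/* 21-bit adrp immediate: bits [32:12] of Page(target) - Page(P),
   where target stands for S + A. */
static uint32_t pg_hi21(uint64_t target, uint64_t P)
{
    uint64_t x = page_of(target) - page_of(P);
    return (uint32_t)((x >> 12) & 0x1FFFFF);
}

/* :lo12: field for an access of 2^log2_size bytes: X[11:log2_size].
   log2_size is 0 for add and ldrb, 1 for ldrh, 2 for 32-bit, 3 for 64-bit. */
static uint32_t lo12(uint64_t target, unsigned log2_size)
{
    return (uint32_t)((target & 0xFFF) >> log2_size);
}

/* Address produced by adrp (with hi21) followed by add (with an unscaled lo12). */
static uint64_t adrp_add(uint64_t P, uint32_t hi21, uint32_t low)
{
    /* Sign-extend the 21-bit page count. */
    int64_t pages = (int64_t)(hi21 ^ 0x100000u) - 0x100000;
    return page_of(P) + ((uint64_t)pages << 12) + low;
}
```

The scaled variants drop the low bits because the load/store encodings multiply the immediate by the access size, so a naturally aligned target loses nothing.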

A GOT-relative version of these relocations also exists; the assembler will pick the right one.

Control-flow relocations

Operator    | Relocation name | Operation | Inst | Immediate | Check
------------+-----------------+-----------+------+-----------+----------
[implicit]  | TSTBR14         | S + A - P | tbz  | X[15:2]   | |X|≤2^15
            |                 |           | tbnz |           |  
------------+-----------------+-----------+------+-----------+----------
[implicit]  | CONDBR19        | S + A - P | b.*  | X[20:2]   | |X|≤2^20
------------+-----------------+-----------+------+-----------+----------
[implicit]  | JUMP26          | S + A - P | b    | X[27:2]   | |X|≤2^27
------------+-----------------+-----------+------+-----------+----------
[implicit]  | CALL26          | S + A - P | bl   | X[27:2]   | |X|≤2^27
------------+-----------------+-----------+------+-----------+----------

Epilogue

I couldn't find any official documentation.
The tables above have been reconstructed from the GAS test case and the ARM document explaining the types of relocation available for AArch64-compliant ELFs.

The tables don't show all the relocations present in the ARM document, as most of them are complementary versions, picked up by the assembler automatically.

A section with examples would be great, but I don't have an ARM GAS.
In the future I may extend this answer to include examples of assembly listings and relocation dumps.
