Understanding ARM relocation (example: str x0, [tmp, #:lo12:zbi_paddr])

Problem description

I found this line of assembly in the Zircon kernel's start.S:

str     x0, [tmp, #:lo12:zbi_paddr]

for ARM64. I also found that zbi_paddr is defined in C++:

extern paddr_t zbi_paddr;

So I started looking into what #:lo12: means.

I found https://stackoverflow.com/a/38608738/6655884, which looks like a great explanation, but it does not explain the very basics: what a relocation is and why certain things are needed.

My guess is that, since zbi_paddr is defined in start.S and used in C++ code, and since start.S is assembled into an object file start.o whose addresses start at 0, the linking process has to relocate all addresses there to their addresses in the final executable file.

In order to keep track of the symbols that need relocation, ELF stores these structs, as said in the answer:

typedef struct
{
    Elf64_Addr r_offset;    /* Address of reference */
    Elf64_Xword r_info;     /* Symbol index and type of relocation */
} Elf64_Rel;

typedef struct
{
    Elf64_Addr r_offset;    /* Address of reference */
    Elf64_Xword r_info;     /* Symbol index and type of relocation */
    Elf64_Sxword r_addend;  /* Constant part of expression */
} Elf64_Rela;

So for example, r_offset would store the address of zbi_paddr in the final executable. Then, when the program is loaded, the loader looks at these structs and fills in the address of zbi_paddr from the C++ code.

After that, I completely lost track of why things like S, A, P, X, abs_g0_s and lo12 are needed. The answer says it is related to instructions not being able to insert 64 bits into registers. Can someone give me more context? I don't understand, because there are already ways to insert 64 bits into registers. And how is this related to relocation?

Solution

The underlying issue is that ARM64 instructions are all 32 bits in size, which limits the number of bits of immediate data that can be encoded in any one instruction. You certainly cannot encode 64 bits of address, or even 32 bits.

The code and static data of the kernel can be expected to be well under 4 GB, so in order to store data in the static variable zbi_paddr, the programmer can write the following two instructions (including the preceding one which you omitted but is crucial). Note that tmp is a macro defined above as x9, so the code expands to:

adrp    x9, zbi_paddr
str     x0, [x9, #:lo12:zbi_paddr]

Now when linking occurs, the linker will know the layout of the entire kernel, and the relative locations of all symbols. This scheme supports position-independent code, so the absolute addresses need not be known, but we will certainly know the displacement between zbi_paddr and the adrp instruction above, which will fit in a signed 32-bit value, as well as the offset of zbi_paddr within its 4KB page (since the kernel will necessarily be loaded at a page-aligned address).

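So bits 12 and higher of this displacement will be encoded into the adrp instruction, which has a 21-bit immediate field. (Strictly speaking, the encoded value is the difference between the 4 KB page containing zbi_paddr and the 4 KB page containing the adrp instruction.) adrp will sign-extend it, shift it left by 12 bits, add it to the program counter with its low 12 bits cleared, and place the result in x9. Then x9 will contain bits 63-12 of the absolute address of zbi_paddr, with the low 12 bits being zeroed.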

The 12-bit offset of zbi_paddr within its page will be encoded into the 12-bit immediate field of the str instruction. It adds this immediate to the value in x9, which will then yield the address of zbi_paddr, and it stores x0 at that address. So we have managed to store a value in zbi_paddr with just two instructions.
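
The page/offset split described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not code from the Zircon tree: the function names and both addresses are made up for the example.

```python
PAGE_MASK = 0xFFF  # the low 12 bits select a byte within a 4 KB page


def split_for_adrp(symbol_addr, adrp_pc):
    """Return (page_delta, lo12) for an adrp + :lo12: instruction pair."""
    # Signed number of 4 KB pages between the adrp instruction and the symbol.
    page_delta = (symbol_addr >> 12) - (adrp_pc >> 12)
    if not -(1 << 20) <= page_delta < (1 << 20):
        raise ValueError("symbol is out of adrp range (+/- 4 GB)")
    # Offset of the symbol inside its own page.
    lo12 = symbol_addr & PAGE_MASK
    return page_delta, lo12


def emulate_adrp_then_offset(adrp_pc, page_delta, lo12):
    """Recompute the address the way the two instructions do at run time."""
    x9 = (adrp_pc & ~PAGE_MASK) + (page_delta << 12)  # what adrp leaves in x9
    return x9 + lo12                                  # what str/ldr/add adds


# Hypothetical addresses, chosen only for illustration.
SYMBOL = 0xFFFF000000245678   # pretend address of zbi_paddr
ADRP_PC = 0xFFFF000000100A3C  # pretend address of the adrp instruction

page_delta, lo12 = split_for_adrp(SYMBOL, ADRP_PC)
print(hex(page_delta), hex(lo12))  # 0x145 0x678
print(hex(emulate_adrp_then_offset(ADRP_PC, page_delta, lo12)))
```

Note that neither value depends on where the kernel is finally loaded, as long as the load address is page-aligned and the code and data move together.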

To support this, the object file produced by assembling our code needs to instruct the linker that bits 32-12 of the displacement need to be inserted into the adrp instruction, and bits 11-0 of the address of zbi_paddr need to be inserted into the str instruction. These instructions to the linker are what relocations are; they'll contain a reference to the symbol whose address is to be encoded (here zbi_paddr) and what specifically is to be done with it. ELF supports relocations specifically designed for these instructions, that put just the right bits in the right place in the instruction word.
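
As a rough sketch of what "putting the right bits in the right place" means, the Python below patches the two instruction words the way a linker might. The relocation type numbers and field positions are taken from the AArch64 ELF ABI and the A64 encodings as I understand them. The helper names and addresses are hypothetical, and a real linker also performs overflow checks that are omitted here.

```python
R_AARCH64_ADR_PREL_PG_HI21 = 275    # Page(S+A) - Page(P), bits 32:12, for adrp
R_AARCH64_LDST64_ABS_LO12_NC = 286  # bits 11:3 of S+A, for 64-bit ldr/str


def split_r_info(r_info):
    """Split Elf64_Rela.r_info into (symbol index, relocation type)."""
    return r_info >> 32, r_info & 0xFFFFFFFF


def page(addr):
    """Clear the low 12 bits, giving the start of the 4 KB page."""
    return addr & ~0xFFF


def apply_reloc(insn, r_type, s, a, p):
    """Patch one 32-bit instruction word.

    s = symbol address, a = addend, p = address of the instruction itself.
    """
    if r_type == R_AARCH64_ADR_PREL_PG_HI21:
        imm21 = ((page(s + a) - page(p)) >> 12) & 0x1FFFFF
        immlo, immhi = imm21 & 0x3, imm21 >> 2
        # adrp keeps immlo in bits 30:29 and immhi in bits 23:5.
        cleared = insn & ~((0x3 << 29) | (0x7FFFF << 5))
        return cleared | (immlo << 29) | (immhi << 5)
    if r_type == R_AARCH64_LDST64_ABS_LO12_NC:
        lo12 = (s + a) & 0xFFF
        if lo12 % 8:
            raise ValueError("64-bit load/store needs an 8-byte aligned offset")
        # The imm12 field (bits 21:10) holds the offset scaled down by 8.
        return (insn & ~(0xFFF << 10)) | ((lo12 >> 3) << 10)
    raise NotImplementedError(r_type)


ADRP_X9 = 0x90000009    # adrp x9, <page delta 0>
STR_X0_X9 = 0xF9000120  # str x0, [x9]

# Hypothetical addresses, for illustration only.
S, P = 0xFFFF000000245678, 0xFFFF000000100A3C
print(hex(apply_reloc(ADRP_X9, R_AARCH64_ADR_PREL_PG_HI21, S, 0, P)))
print(hex(apply_reloc(STR_X0_X9, R_AARCH64_LDST64_ABS_LO12_NC, S, 0, P + 4)))
```

The r_offset field of each relocation entry tells the linker which instruction word to patch, and r_info selects both the symbol and which of these bit-shuffling recipes to apply.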

It's true that there are other ways to get a 64-bit value into a register. For instance, it can be placed in the literal pool, which is an area of data close enough to the corresponding code that it can be reached with a single ldr instruction (with PC-relative displacement). You could have a relocation telling the linker to insert the absolute address of zbi_paddr in the literal pool. But loading it requires an additional memory access, which is slower than adrp; moreover, the 8 bytes of literal, plus the ldr, plus the str to actually do the store, add up to a total of 16 bytes of memory needed. The adrp/str approach only needs 8, and it works better with position-independent code, where the linker may not actually know the absolute address of zbi_paddr.

If you don't like the load from memory, you can get the absolute address of zbi_paddr into a register with up to four mov/movk instructions, loading 16 bits at a time. There are relocations for that, too. But with the final str, we are using up to 20 bytes of code; executing five instructions takes more clock cycles than two; and there's still a problem with position-independent code.
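
To make the comparison concrete, here is a small Python sketch that splits a 64-bit address into the four 16-bit pieces a movz/movk sequence would carry and tallies the code size of each approach. The address and helper names are invented for the example, and the rendered assembly uses plain immediates rather than relocation operators (which, as far as I know, GNU as spells :abs_g0_nc: through :abs_g3:).

```python
INSN_BYTES = 4  # every A64 instruction is 4 bytes


def movz_movk_chunks(value):
    """Split a 64-bit value into (shift, imm16) pairs, low half-word first."""
    return [(shift, (value >> shift) & 0xFFFF) for shift in (0, 16, 32, 48)]


def render(reg, value):
    """Render the movz/movk sequence as assembly text (always four instructions)."""
    lines = []
    for i, (shift, imm) in enumerate(movz_movk_chunks(value)):
        op = "movz" if i == 0 else "movk"  # movz clears the other bits first
        lines.append(f"{op} {reg}, #0x{imm:04x}, lsl #{shift}")
    return lines


# Bytes needed to store x0 into a global with each approach from the text.
SIZES = {
    "adrp + str": 2 * INSN_BYTES,
    "ldr (literal) + str + 8-byte literal": 2 * INSN_BYTES + 8,
    "movz + 3 x movk + str": 5 * INSN_BYTES,
}

ADDRESS = 0xFFFF000000245678  # hypothetical absolute address of zbi_paddr
for line in render("x9", ADDRESS):
    print(line)
for name, size in SIZES.items():
    print(f"{name}: {size} bytes")
```

Unlike the adrp pair, every chunk here depends on the absolute address, which is why this sequence does not suit position-independent code.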

As such, adrp/str, with :lo12: as noted, is the standard accepted method for accessing a global or static variable. If you want to load instead of store, you use adrp/ldr. And if you want the address of zbi_paddr in a register, you do

adrp x9, zbi_paddr
add x9, x9, #:lo12:zbi_paddr

The add instruction also supports a 12-bit immediate, precisely for this purpose.

These features are explained in the GNU assembler manual.
