Understanding ARM relocation (example: str x0, [tmp, #:lo12:zbi_paddr])


Problem description

I found this line of assembly in the Zircon kernel's start.S

str     x0, [tmp, #:lo12:zbi_paddr]

for ARM64. I also found that zbi_paddr is declared in C++:

extern paddr_t zbi_paddr;

So I started looking into what #:lo12: means.

I found https://stackoverflow.com/a/38608738/6655884, which looks like a great explanation, but it does not explain the very basics: what a relocation is and why these things are needed.

My guess is that, since zbi_paddr is defined in start.S and used in C++ code, and since start.S is assembled into an object file start.o whose addresses start at 0, the linking process will have to relocate all of the addresses there to their addresses in the final executable file.

In order to keep track of the symbols that need relocation, ELF stores these structs, as said in the answer:

typedef struct
{
    Elf64_Addr r_offset;    /* Address of reference */
    Elf64_Xword r_info;     /* Symbol index and type of relocation */
} Elf64_Rel;

typedef struct
{
    Elf64_Addr r_offset;    /* Address of reference */
    Elf64_Xword r_info;     /* Symbol index and type of relocation */
    Elf64_Sxword r_addend;  /* Constant part of expression */
} Elf64_Rela;

So for example, r_offset would store the address of zbi_paddr in the final executable. Then, when the program is loaded, the loader looks at these structs and fills in the address of zbi_paddr from the C++ code.

After that, I completely missed why things like S, A, P, X, abs_g0_s and lo12 are needed. He says it's related to instructions not being able to insert 64 bits into registers. Can someone give me more context? I can't understand this, since there are already ways to insert 64 bits into registers. And how is this related to relocation?

Solution

The underlying issue is that ARM64 instructions are all 32 bits in size, which limits the number of bits of immediate data that can be encoded in any one instruction. You certainly cannot encode 64 bits of address, or even 32 bits.

The code and static data of the kernel can be expected to be well under 4 GB, so in order to store data in the static variable zbi_paddr, the programmer can write the following two instructions (including the preceding one which you omitted but is crucial). Note that tmp is a macro defined above as x9, so the code expands to:

adrp    x9, zbi_paddr
str     x0, [x9, #:lo12:zbi_paddr]

Now when linking occurs, the linker will know the layout of the entire kernel, and the relative locations of all symbols. This scheme supports position-independent code, so the absolute addresses need not be known, but we will certainly know the displacement between zbi_paddr and the adrp instruction above, which will fit in a signed 32-bit value, as well as the offset of zbi_paddr within its 4KB page (since the kernel will necessarily be loaded at a page-aligned address).

So bits 12 and higher of this displacement will be encoded into the adrp instruction, which has a 21-bit immediate field. adrp will sign-extend it, add it to the corresponding bits of the program counter, and place the result in x9. Then x9 will contain bits 63-12 of the absolute address of zbi_paddr, with the low 12 bits being zeroed.
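The adrp arithmetic described above can be modelled in a few lines of Python. This is only a sketch: the function names and both addresses are made up for illustration, and the immediate is computed as the difference between the two page numbers, which is how the AArch64 ELF ABI defines the page-relative relocation.

```python
PAGE_MASK = 0xFFF  # low 12 bits: offset within a 4 KB page

def adrp_imm21(pc: int, symbol: int) -> int:
    """Signed page delta the linker would place in adrp's 21-bit immediate."""
    delta = (symbol >> 12) - (pc >> 12)
    if not -(1 << 20) <= delta < (1 << 20):
        raise ValueError("symbol is more than +/-4 GB away from the adrp")
    return delta

def adrp_result(pc: int, imm21: int) -> int:
    """Value adrp leaves in its destination register (page base, low 12 bits zero)."""
    return ((pc & ~PAGE_MASK) + (imm21 << 12)) & 0xFFFFFFFFFFFFFFFF

# Hypothetical addresses, chosen only for illustration.
pc = 0xFFFF000000101234         # address of the adrp instruction
zbi_paddr = 0xFFFF0000002A5678  # address of the variable

imm = adrp_imm21(pc, zbi_paddr)
x9 = adrp_result(pc, imm)
print(hex(imm), hex(x9))
```

Because only the difference between the two pages is encoded, the result stays correct if the whole image is moved by any multiple of 4 KB, which is why this scheme suits position-independent code.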

The 12-bit offset of zbi_paddr within its page will be encoded into the 12-bit immediate field of the str instruction. It adds this immediate to the value in x9, which will then yield the address of zbi_paddr, and it stores x0 at that address. So we have managed to store a value in zbi_paddr with just two instructions.
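As a rough illustration of what the linker does for #:lo12:, the following Python sketch patches the in-page offset into a 64-bit str. The helper name and the address are hypothetical. One detail the sketch makes explicit, which goes slightly beyond the text above: the unsigned-offset form of a 64-bit str stores its immediate in units of 8 bytes, so the field actually receives the offset divided by 8 and the variable must be 8-byte aligned.

```python
def str64_unsigned_offset(rt: int, rn: int, byte_offset: int) -> int:
    """Encode `str x<rt>, [x<rn>, #byte_offset]` (64-bit, unsigned-offset form)."""
    if byte_offset % 8 != 0 or not 0 <= byte_offset < 4096 * 8:
        raise ValueError("offset must be a multiple of 8 in the range 0..32760")
    imm12 = byte_offset // 8  # the 12-bit field is scaled by the access size
    return 0xF9000000 | (imm12 << 10) | (rn << 5) | rt

# Hypothetical address of zbi_paddr, chosen only for illustration.
zbi_paddr = 0xFFFF0000002A5678
x9 = zbi_paddr & ~0xFFF   # what the preceding adrp left in x9
lo12 = zbi_paddr & 0xFFF  # what #:lo12:zbi_paddr evaluates to

insn = str64_unsigned_offset(rt=0, rn=9, byte_offset=lo12)
print(hex(lo12), hex(insn))
```

Adding lo12 back to the page base in x9 reproduces the full address, which is exactly the address the str writes to.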

To support this, the object file produced by assembling our code needs to instruct the linker that bits 32-12 of the displacement need to be inserted into the adrp instruction, and bits 11-0 of the address of zbi_paddr need to be inserted into the str instruction. These instructions to the linker are the relocations; each one contains a reference to the symbol whose address is to be encoded (here zbi_paddr) and says what specifically is to be done with it. ELF supports relocations specifically designed for these instructions, which put just the right bits in the right place in the instruction word.
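To make the two relocation records concrete, here is a small Python sketch that packs and unpacks Elf64_Rela entries like the ones an assembler would emit for the adrp/str pair. The relocation type numbers are the ones the AArch64 ELF ABI assigns; the section offsets (0x40, 0x44) and the symbol index (7) are invented for the example. Note that r_offset names the place to patch (the instruction word), while the symbol is identified by the index packed into r_info.

```python
import struct

# Relocation type numbers from the AArch64 ELF ABI.
R_AARCH64_ADR_PREL_PG_HI21 = 275    # page delta into adrp
R_AARCH64_LDST64_ABS_LO12_NC = 286  # low 12 bits into a 64-bit ldr/str

def elf64_r_info(sym_index: int, rel_type: int) -> int:
    """Combine symbol index (high 32 bits) and relocation type (low 32 bits)."""
    return (sym_index << 32) | rel_type

def pack_rela(r_offset: int, sym_index: int, rel_type: int, addend: int = 0) -> bytes:
    """Serialize one little-endian Elf64_Rela record (24 bytes)."""
    return struct.pack("<QQq", r_offset, elf64_r_info(sym_index, rel_type), addend)

def unpack_rela(raw: bytes):
    """Return (r_offset, symbol index, relocation type, r_addend)."""
    r_offset, r_info, r_addend = struct.unpack("<QQq", raw)
    return r_offset, r_info >> 32, r_info & 0xFFFFFFFF, r_addend

# Hypothetical layout: adrp at section offset 0x40, str at 0x44,
# and zbi_paddr as entry 7 of the symbol table.
relocs = [
    pack_rela(0x40, 7, R_AARCH64_ADR_PREL_PG_HI21),
    pack_rela(0x44, 7, R_AARCH64_LDST64_ABS_LO12_NC),
]
for raw in relocs:
    print(unpack_rela(raw))
```

On a real object file, `readelf -r` prints the same fields in decoded form.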

It's true that there are other ways to get a 64-bit value into a register. For instance, it can be placed in the literal pool, which is an area of data close enough to the corresponding code that it can be reached with a single ldr instruction (with PC-relative displacement). You could have a relocation telling the linker to insert the absolute address of zbi_paddr in the literal pool. But loading it requires an additional memory access, which is slower than adrp; moreover, the 8 bytes of literal, plus the ldr, plus the str to actually do the store, add up to a total of 16 bytes of memory needed. The adrp/str approach only needs 8, and it works better with position-independent code, where the linker may not actually know the absolute address of zbi_paddr.

If you don't like the load from memory, you can get the absolute address of zbi_paddr into a register with up to four mov/movk instructions, loading 16 bits at a time. There are relocations for that, too. But with the final str, we are using up to 20 bytes of code; executing five instructions takes more clock cycles than two; and there's still a problem with position-independent code.
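The mov/movk alternative can be sketched the same way: the 64-bit address is split into four 16-bit pieces, one per instruction. The Python below only models the arithmetic, with a hypothetical address and made-up function names. In GNU assembler syntax the pieces would be requested with operators such as :abs_g0_nc: through :abs_g3:.

```python
def movz_movk_chunks(value: int) -> list:
    """Split a 64-bit value into the four 16-bit immediates, lowest first."""
    return [(value >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]

def run_movz_movk(chunks: list) -> int:
    """Model movz (first chunk, rest zeroed) followed by movk for each later chunk."""
    reg = 0
    for i, chunk in enumerate(chunks):
        shift = 16 * i
        reg = (reg & ~(0xFFFF << shift)) | (chunk << shift)  # replace one 16-bit lane
    return reg

# Hypothetical absolute address of zbi_paddr.
zbi_paddr = 0xFFFF0000002A5678
chunks = movz_movk_chunks(zbi_paddr)
print([hex(c) for c in chunks])

# Four 4-byte instructions plus the final str.
code_bytes = (len(chunks) + 1) * 4
```

The sequence encodes an absolute address, so unlike adrp it has to be re-patched whenever the image is loaded somewhere else.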

As such, adrp/str, with :lo12: as noted, is the standard accepted method for accessing a global or static variable. If you want to load instead of store, you use adrp/ldr. And if you want the address of zbi_paddr in a register, you do

adrp x9, zbi_paddr
add x9, x9, #:lo12:zbi_paddr

The add instruction also supports a 12-bit immediate, precisely for this purpose.
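For comparison with the str case, this hedged Python sketch encodes the add with the low 12 bits of a hypothetical address. The helper name is invented. Unlike a 64-bit str, the add immediate is a plain byte count, so the in-page offset goes into the field unscaled (the corresponding ELF relocation type is R_AARCH64_ADD_ABS_LO12_NC).

```python
def add_imm64(rd: int, rn: int, imm12: int) -> int:
    """Encode `add x<rd>, x<rn>, #imm12` (64-bit, no shift)."""
    if not 0 <= imm12 < 4096:
        raise ValueError("immediate must fit in 12 bits")
    return 0x91000000 | (imm12 << 10) | (rn << 5) | rd

# Hypothetical address of zbi_paddr, chosen only for illustration.
zbi_paddr = 0xFFFF0000002A5678
lo12 = zbi_paddr & 0xFFF

insn = add_imm64(rd=9, rn=9, imm12=lo12)
print(hex(insn))
```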

These features are explained in the GNU assembler manual.
