Why is the address of static variables relative to the Instruction Pointer?

Question

I am following this tutorial about assembly.

According to the tutorial (which I also tried locally, and got similar results), the following source code:

int natural_generator()
{
        int a = 1;
        static int b = -1;
        b += 1;              /* (1, 2) */
        return a + b;
}

Compiles to these assembly instructions:

$ gdb static
(gdb) break natural_generator
(gdb) run
(gdb) disassemble
Dump of assembler code for function natural_generator:
push   %rbp
mov    %rsp,%rbp
movl   $0x1,-0x4(%rbp)
mov    0x177(%rip),%eax        # (1)
add    $0x1,%eax
mov    %eax,0x16c(%rip)        # (2)
mov    -0x4(%rbp),%eax
add    0x163(%rip),%eax        # 0x100001018 <natural_generator.b>
pop    %rbp
retq   
End of assembler dump.

(Line number comments (1), (2) and (1, 2) added by me.)

Question: why, in the compiled code, is the address of the static variable b relative to the instruction pointer (RIP), which constantly changes (see lines (1) and (2)) and thus generates more complicated assembly code, rather than relative to the specific section of the executable where such variables are stored?

According to the mentioned tutorial, there is such a section:

This is because the value for b is hardcoded in a different section of the sample executable, and it’s loaded into memory along with all the machine code by the operating system’s loader when the process is launched.

(Emphasis mine.)

Solution

There are two main reasons why RIP-relative addressing is used to access the static variable b. The first is that it makes the code position-independent, meaning that if it's used in a shared library or position-independent executable, the code can be relocated more easily. The second is that it allows the code to be loaded anywhere in the 64-bit address space without encoding huge 8-byte (64-bit) displacements in the instruction, which x86-64 memory operands don't support anyway: their displacements are limited to 32 bits.

You mention that the compiler could instead generate code that references the variable relative to the beginning of the section it lives in. While it's true that doing this would have the same advantages as given above, it wouldn't make the assembly any less complicated. In fact, it would make it more complicated. The generated assembly code would first have to calculate the address of the section the variable lives in, since it would only know that location relative to the instruction pointer. It would then have to store that address in a register, so that accesses to b (and any other variables in the section) can be made relative to it.

Since 32-bit x86 code doesn't support RIP-relative addressing, your alternate solution is in fact what the compiler does when generating 32-bit position-independent code. It uses the address of the global offset table (GOT) as a base and accesses the variable b at a fixed offset from that base (the @GOTOFF operand). Here's the assembly generated from your code when compiled with gcc -m32 -O3 -fPIC -S test.c:

natural_generator:
        call    __x86.get_pc_thunk.cx
        addl    $_GLOBAL_OFFSET_TABLE_, %ecx
        movl    b.1392@GOTOFF(%ecx), %eax
        leal    1(%eax), %edx
        addl    $2, %eax
        movl    %edx, b.1392@GOTOFF(%ecx)
        ret

The first instruction, a call to a small helper, places the address of the following instruction in ECX. The next instruction calculates the address of the GOT by adding the offset of the GOT relative to that instruction. The register ECX now contains the address of the GOT and is used as a base when accessing the variable b in the rest of the code.

Compare that to 64-bit code generated by gcc -m64 -O3 -S test.c:

natural_generator:
        movl    b.1745(%rip), %eax
        leal    1(%rax), %edx
        addl    $2, %eax
        movl    %edx, b.1745(%rip)
        ret

(The code is different from the example in your question because optimization is turned on. In general it's a good idea to only look at optimized output, as without optimization the compiler often generates terrible code that does a lot of useless things. Also note that the -fPIC flag isn't needed here, as the compiler accesses b RIP-relative in 64-bit code regardless.)

Notice that there are two fewer assembly instructions in the 64-bit version, making it the less complicated one. You can also see that it uses one fewer register (ECX). While that doesn't make much of a difference in your code, in a more complicated example it's a register that could have been used for something else. Losing it makes the 32-bit code even more complicated, as the compiler needs to do more juggling of registers.
