Why are global variables in x86-64 accessed relative to the instruction pointer?


Problem description

I tried to compile C code to assembly using gcc -S -fasm foo.c. The C code declares a global variable and a local variable in the main function, as shown below:

int y=6;
int main()
{
        int x=4;
        x=x+y;
        return 0;
}

I looked at the assembly code generated from this C code and saw that the global variable y is accessed using the value of the rip instruction pointer.

I thought that only const global variables were stored in the text segment, but this example makes it look as if regular global variables are stored in the text segment too, which seems very weird.

I guess that some assumption I made is wrong, so can someone please explain it to me?

The assembly code generated by the C compiler:

        .file   "foo.c"
        .text
        .globl  y
        .data
        .align 4
        .type   y, @object
        .size   y, 4
y:
        .long   6
        .text
        .globl  main
        .type   main, @function

main:
.LFB0:
        .cfi_startproc
        pushq   %rbp
        .cfi_def_cfa_offset 16
        .cfi_offset 6, -16
        movq    %rsp, %rbp
        .cfi_def_cfa_register 6
        movl    $4, -4(%rbp)
        movl    y(%rip), %eax
        addl    %eax, -4(%rbp)
        movl    $0, %eax
        popq    %rbp
        .cfi_def_cfa 7, 8
        ret
        .cfi_endproc
.LFE0:

Solution

The offsets between different sections of your executable are link-time constants, so RIP-relative addressing is usable for any section (including .data where your non-const globals are). Note the .data in your asm output.

This applies even in a PIE executable or shared library, where the absolute addresses are not known until runtime (ASLR).

Runtime ASLR for position-independent executables (PIE) randomizes one base address for the entire program, not individual segment start addresses relative to each other.

All access to static variables uses RIP-relative addressing because that's most efficient, even in a position-dependent executable where absolute addressing is an option (because absolute addresses of static code/data are link-time constants, not relocated by dynamic linking).
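
As a rough illustration of the link-time-constant offset, the C sketch below computes the distance between a function in .text and the global y in .data. The names anchor and text_to_data_offset are made up for this example, and converting a function pointer to an integer is assumed to work as it does with GCC on x86-64 Linux; it is not strictly portable C.

```c
#include <stdint.h>

int y = 6;                      /* non-const global, lives in .data */

/* Hypothetical reference point in .text; any function would do. */
void anchor(void) {}

/* Signed distance in bytes from anchor (in .text) to y (in .data).
 * Casting a function pointer to uintptr_t is not strictly portable C,
 * but it works with GCC on x86-64 Linux, which is assumed here. */
intptr_t text_to_data_offset(void)
{
    return (intptr_t)((uintptr_t)&y - (uintptr_t)&anchor);
}
```

Printing (void *)&y and text_to_data_offset() from main in a PIE build, run several times, should show the absolute address changing under ASLR while the offset stays the same. That fixed distance is what lets y(%rip) be encoded with a constant rel32.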


In 32-bit x86, there are two redundant ways to encode an addressing mode with no registers and a disp32 absolute address (with and without a SIB byte). x86-64 repurposed the shorter one as RIP+rel32, so mov foo, %eax is 1 byte longer than mov foo(%rip), %eax.

64-bit absolute addressing would take even more space, and is only available for mov to/from RAX/EAX/AX/AL unless you use a separate instruction to get the address into a register first.

(In x86-64 Linux PIE/PIC, 64-bit absolute addressing is allowed, and handled via load-time fixups to put the right address into the code or jump table or statically-initialized function pointer. So code doesn't technically have to be position-independent, but normally it's more efficient to be. And 32-bit absolute addressing isn't allowed, because ASLR isn't limited to the low 31 bits of virtual address space.)


Note that in a non-PIE Linux executable, gcc will use 32-bit absolute addressing for putting the address of static data in a register. For example, puts("hello"); will typically compile to:

mov   $.LC0, %edi     # mov r32, imm32
call  puts

In the default non-PIE code model, static code and data get linked into the low 2 GiB (the low 31 bits) of virtual address space, so 32-bit absolute addresses work whether they're zero- or sign-extended to 64-bit. This is handy for indexing static arrays, too, for example mov array(%rax), %edx ; add $4, %eax.
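
To see the static-array case from the C side, here is a small sketch of an array with static storage indexed by a runtime value. The names array and load_elem are illustrative only. In a non-PIE build (for example gcc -O2 -fno-pie -no-pie), the load can typically use the array's 32-bit absolute address directly as the displacement. A PIE build has to get the address into a register with a RIP-relative LEA first, because RIP-relative addressing cannot be combined with an index register.

```c
/* Static storage: the address is a link-time constant in a non-PIE build. */
int array[4] = {10, 20, 30, 40};

/* Non-PIE, roughly:  movl array(,%rdi,4), %eax
 * PIE, roughly:      leaq array(%rip), %rax ; movl (%rax,%rdi,4), %eax
 * Exact output depends on compiler version and options. */
int load_elem(long i)
{
    return array[i];
}
```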

See "32-bit absolute addresses no longer allowed in x86-64 Linux?" for more about PIE executables, which use position-independent code for everything, including RIP-relative LEA like 7-byte lea .LC0(%rip), %rdi instead of 5-byte mov $.LC0, %edi.
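
The two strategies can be compared on a one-line function. The function name greet is hypothetical, and the instruction sequences in the comment are what GCC typically emits for these flags rather than guaranteed output.

```c
#include <stdio.h>

/* Non-PIE (gcc -O2 -fno-pie -no-pie), typically:
 *     mov  $.LC0, %edi          # 5 bytes, 32-bit absolute immediate
 * PIE (gcc -O2 -fpie -pie), typically:
 *     lea  .LC0(%rip), %rdi     # 7 bytes, RIP-relative
 * followed by a call or tail-jump to puts. */
int greet(void)
{
    return puts("hello");
}
```

Compiling this file with gcc -S under each set of flags and diffing the two .s files is one way to check which form a particular toolchain picks by default.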

I mention Linux because it looks from the .cfi directives like you're compiling for a non-Windows platform.
