Unexpected value of a function pointer local variable


Problem description

I have done some experiments in which I created a local variable of type pointer to function that points to printf. Then I called printf both directly and through that variable, as follows:

#include<stdio.h>
typedef int (*func)(const char*,...);

int main()
{
        func x=printf;
        printf("%p\n", x);
        x("%p\n", x);
        return 0;
}

I compiled it, looked at the disassembly of main using gdb, and got this:

   0x000000000000063a <+0>:     push   %rbp
   0x000000000000063b <+1>:     mov    %rsp,%rbp
   0x000000000000063e <+4>:     sub    $0x10,%rsp
   0x0000000000000642 <+8>:     mov    0x20098f(%rip),%rax        # 0x200fd8
   0x0000000000000649 <+15>:    mov    %rax,-0x8(%rbp)
   0x000000000000064d <+19>:    mov    -0x8(%rbp),%rax
   0x0000000000000651 <+23>:    mov    %rax,%rsi
   0x0000000000000654 <+26>:    lea    0xb9(%rip),%rdi        # 0x714
   0x000000000000065b <+33>:    mov    $0x0,%eax
   0x0000000000000660 <+38>:    callq  0x520 <printf@plt>
   0x0000000000000665 <+43>:    mov    -0x8(%rbp),%rax
   0x0000000000000669 <+47>:    mov    -0x8(%rbp),%rdx
   0x000000000000066d <+51>:    mov    %rax,%rsi
   0x0000000000000670 <+54>:    lea    0x9d(%rip),%rdi        # 0x714
   0x0000000000000677 <+61>:    mov    $0x0,%eax
   0x000000000000067c <+66>:    callq  *%rdx
   0x000000000000067e <+68>:    mov    $0x0,%eax
   0x0000000000000683 <+73>:    leaveq
   0x0000000000000684 <+74>:    retq

What is weird to me is that calling printf directly uses the plt (as expected), but calling it through the local variable uses a completely different address (as you can see in line 4 of the assembly that the value stored in local variable x is not the address of the plt entry).

How can that be? Don't all the calls to functions undefined in the executable go first through the plt for better performance and for pic code?

Solution

(as you can see in line 4 of the assembly that the value stored in local variable x is not the address of the plt entry)

Huh? The value isn't visible in the disassembly, only the location it's loaded from. (In practice it's not loading a pointer to the PLT entry, but line 4 of the assembly doesn't tell you that; see footnote 1.) Use objdump -dR to see dynamic relocations.

That's a load from memory using a RIP-relative addressing mode. In this case it's loading a pointer to the real printf address in libc. That pointer is stored in the Global Offset Table (GOT).

To make this work, the printf symbol gets "early binding" instead of lazy dynamic linking, avoiding PLT overhead for later uses of that function pointer.

Footnote 1: Although maybe you were basing that reasoning on the fact that it's a load instead of a RIP-relative LEA. That pretty much does tell you it's not the PLT entry; part of the point of the PLT is to have an address that's a link-time constant for call rel32, which also enables LEA with a RIP+rel32 addressing mode. The compiler would have used that if it wanted the PLT address in a register.


BTW, the PLT stub itself also uses the GOT entry for its memory-indirect jump; for symbols that are only used as function call targets, the GOT entry holds a pointer back to the PLT stub, to the push / jmp instructions that invoke the lazy dynamic linker to resolve that PLT entry. i.e. to update the GOT entry.
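The lazy-binding mechanism described above can be modelled in plain C. This is only a toy sketch, not how the real PLT is implemented (that is machine code emitted by the linker): real_add stands in for a libc function, got_slot for its GOT entry, resolver_stub for the push / jmp path into the dynamic linker, and plt_add for the PLT entry. All of these names are hypothetical and chosen for illustration.

```c
typedef int (*add_fn)(int, int);

/* Hypothetical stand-in for the real library function (e.g. printf in libc). */
int real_add(int a, int b) { return a + b; }

int resolver_stub(int a, int b);

/* Toy "GOT entry": before the first call it points back at the resolver stub. */
add_fn got_slot = resolver_stub;

/* Counts how often the toy "dynamic linker" had to run. */
int resolve_count = 0;

/* Toy version of the push / jmp path: resolve the symbol, patch the
 * GOT entry, then continue into the real function. */
int resolver_stub(int a, int b) {
    resolve_count++;
    got_slot = real_add;
    return got_slot(a, b);
}

/* Toy "PLT entry": always a memory-indirect jump through the GOT slot. */
int plt_add(int a, int b) { return got_slot(a, b); }
```

The first plt_add call goes through resolver_stub and patches got_slot; every later call still pays for the indirection through got_slot, but no longer for the resolution.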


Don't all the calls to functions undefined in the executable go first through the plt for better performance

No, the PLT costs runtime performance by adding an extra level of indirection to every call. gcc -fno-plt uses early binding instead of waiting for the first call, so it can inline the indirect call through the GOT right into each call site.

The PLT exists to avoid runtime fixups of call rel32 offsets during dynamic linking and, on 64-bit systems, to allow reaching addresses that are more than 2GB away. It also supports symbol interposition. See https://www.macieira.org/blog/2012/01/sorry-state-of-dynamic-libraries-on-linux/ (written before -fno-plt existed; it's basically like one of the ideas he was suggesting).

The PLT's lazy binding can improve startup performance vs. early binding, but on modern systems where cache hits are very important, doing all the symbol-scanning stuff at once during startup is nice.

and for pic code?

Your code is PIC, or actually PIE (position-independent executable), which most distros configure GCC to do by default.
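If you are unsure which mode your toolchain defaults to, a small check like the following may help. It is a sketch that relies on the __PIE__ and __PIC__ macros that GCC (and Clang) predefine under -fpie / -fpic; the enum and function names are made up for this example.

```c
#include <stddef.h>

enum build_mode { MODE_NON_PIC, MODE_PIC, MODE_PIE };

/* Reports how this translation unit was compiled, based on the
 * compiler's predefined macros. -fpie defines both __PIE__ and __PIC__,
 * so __PIE__ has to be tested first. */
enum build_mode build_mode(void) {
#if defined(__PIE__)
    return MODE_PIE;
#elif defined(__PIC__)
    return MODE_PIC;
#else
    return MODE_NON_PIC;
#endif
}

/* Human-readable label for a build mode. */
const char *build_mode_name(enum build_mode mode) {
    switch (mode) {
    case MODE_PIE:     return "PIE";
    case MODE_PIC:     return "PIC";
    case MODE_NON_PIC: return "non-PIC";
    }
    return NULL;
}
```

Printing build_mode_name(build_mode()) from main, once with your distro's default flags and once with -fno-pie, shows which code model the compiler actually used.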

I expected x to point to the address of the PLT entry of printf

If you use -fno-pie, then the address of the PLT entry is a link-time constant, and at compile time the compiler doesn't know whether you're going to link libc statically or dynamically. So it uses mov $printf, %eax to get the function's address into a register, and at link time that can only convert to mov $printf@plt, %eax.

See it on Godbolt. (The Godbolt default is -fno-pie, unlike on most current Linux distros.)

# gcc9.2 -O3 -fpie    for your first block
        movq    printf@GOTPCREL(%rip), %rbp
        leaq    .LC0(%rip), %rdi
        xorl    %eax, %eax
        movq    %rbp, %rsi        # saved for later in rbp
        call    printf@PLT

vs.

# gcc9.2 -O3 -fno-pie
        movl    $printf, %esi          # linker converts this symbol reference to printf@plt
        movl    $.LC0, %edi
        xorl    %eax, %eax
        call    printf                 # will convert at link-time to printf@plt
      # next use also just uses mov-immediate to rematerialize, instead of saving a load result in a register.

So a PIE executable actually has better efficiency for repeated-use of function pointers to functions in standard libraries: the pointer is the final address, not just the PLT entry.

-fno-plt -fno-pie works more like PIE mode for taking function pointers. Except it can still use $foo 32-bit immediates for the addresses of symbols in the same file, instead of a RIP-relative LEA.

# gcc9.2 -O3 -fno-plt -fno-pie
        movq    printf@GOTPCREL(%rip), %rbp    # saved for later in RBP
        movl    $.LC0, %edi
        xorl    %eax, %eax
        movq    %rbp, %rsi
        call    *printf@GOTPCREL(%rip)
  # pointers to static functions can use  mov $foo, %esi

It seems you need int foo(const char*,...) __attribute__((visibility("hidden"))); to tell the compiler it definitely doesn't need to go through the GOT for this symbol, in PIE mode or with -fno-plt.

Leaving it until link-time for the linker to convert symbol to symbol@plt if necessary allows the compiler to always use efficient 32-bit absolute immediates or RIP-relative addressing and only end up with PLT indirection for functions that turn out to be in a shared library. But then you end up with pointers to PLT entries, instead of pointers to the final address.
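Whichever address ends up in the pointer (a PLT entry with -fno-pie, the final libc address in a PIE), the behavior at the C level is the same. The sketch below illustrates that with snprintf instead of printf so the result can be inspected; format_fn, take_snprintf and format_via_pointer are hypothetical names introduced only for this example.

```c
#include <stdio.h>
#include <stddef.h>

typedef int (*format_fn)(char *, size_t, const char *, ...);

/* Take the address of a libc function, as `func x = printf;` does in the question. */
format_fn take_snprintf(void) { return snprintf; }

/* Call the libc function through the pointer rather than directly. */
int format_via_pointer(char *buf, size_t size, int value) {
    format_fn f = take_snprintf();
    return f(buf, size, "%d", value);
}
```

Compiling this with and without -fno-pie and comparing the generated asm for take_snprintf should show the same mov-immediate versus GOT-load difference as in the listings above.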


If you were using Intel syntax, it would be mov rbp, QWORD PTR printf@GOTPCREL[rip] in GCC's output for this, if you look at asm instead of disassembly.

Looking at compiler output gives you significantly more information than just numeric offsets from RIP in plain objdump output. -r to show relocation symbols helps some, but compiler output is generally better. (Except you don't see that printf gets rewritten to printf@plt.)
