x86_64: Is it possible to "in-line substitute" PLT/GOT references?


Question

I'm not sure what a good subject line for this question is, but here we go ...

In order to force code locality / compactness for a critical section of code, I'm looking for a way to call a function in an external (dynamically-loaded) library through a "jump slot" (an ELF R_X86_64_JUMP_SLOT relocation) directly at the call site - what the linker ordinarily puts into PLT / GOT, but have these inlined right at the call site.

If I mock up the call like this:

#include <stdio.h>
int main(int argc, char **argv)
{
        asm ("push $1f\n\t"
             "jmp *0f\n\t"
             "0: .quad %P0\n"
             "1:\n\t"
             : : "i"(printf), "D"("Hello, World!\n"));
        return 0;
}


The .quad reserves space for a 64-bit word, and the call itself works (please, no comments about this being a lucky coincidence because it breaks certain ABI rules; those are not the subject of this question and can, in my case, be worked around or addressed in other ways. I'm trying to keep this example brief).

The following assembly is created:

0000000000000000 <main>:
   0:   bf 00 00 00 00          mov    $0x0,%edi
                        1: R_X86_64_32  .rodata.str1.1
   5:   68 00 00 00 00          pushq  $0x0
                        6: R_X86_64_32  .text+0x19
   a:   ff 24 25 00 00 00 00    jmpq   *0x0
                        d: R_X86_64_32S .text+0x11
        ...
                        11: R_X86_64_64 printf
  19:   31 c0                   xor    %eax,%eax
  1b:   c3                      retq


But (due to using printf as the immediate, I guess ... ?) the target address here is still that of the PLT hook - the same R_X86_64_64 reloc. Linking the object file against libc into an actual executable results in:

0000000000400428 <printf@plt>:
  400428:       ff 25 92 04 10 00       jmpq   *1049746(%rip)        # 5008c0 <_GLOBAL_OFFSET_TABLE_+0x20>
[ ... ]
0000000000400500 <main>:
  400500:       bf 0c 06 40 00          mov    $0x40060c,%edi
  400505:       68 19 05 40 00          pushq  $0x400519
  40050a:       ff 24 25 11 05 40 00    jmpq   *0x400511
  400511:       [ .quad 400428 ]
  400519:       31 c0                   xorl   %eax, %eax
  40051b:       c3                      retq
[ ... ]
DYNAMIC RELOCATION RECORDS
OFFSET           TYPE              VALUE
[ ... ]
00000000005008c0 R_X86_64_JUMP_SLOT  printf

I.e. this still gives the two-step redirection, first transfer execution to the PLT hook, then jump into the library entry point.

Is there a way how I can instruct the compiler / assembler / linker to - in this example - "inline" the jump slot target at address 0x400511 ? I.e. replace the "local" (resolved at program link time by ld) R_X86_64_64 reloc with the "remote" (resolved at program load time by ld.so) R_X86_64_JUMP_SLOT one (and force non-lazy-load for this section of code) ? Maybe linker mapfiles might make this possible - if so, how ?

EDIT:
To make this clear, the question is about how to achieve this in a dynamically-linked executable / for an external function that's only available in a dynamic library. Yes, it's true static linking resolves this in a simpler way, but:


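  • There are systems (such as Solaris) where static libraries are generally not shipped by the vendor.

  • There are libraries that are not available as source code or as static versions.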

Hence static linking is not helpful here :(

EDIT2:
I've found that on some architectures (notably SPARC, see the section on SPARC relocations in the GNU as manual), GNU as is able to create certain types of relocation references for the linker in-place using modifiers. The quoted SPARC one would use %gdop(symbolname) to make the assembler emit instructions to the linker stating "create that relocation right here". Intel's assembler on Itanium knows the @fptr(symbol) link-relocation operator for the same kind of thing (see also section 4 in the Itanium psABI). But does an equivalent mechanism, something to instruct the assembler to emit a specific linker relocation type at a specific position in the code, exist for x86_64?

I've also found that the GNU assembler has a .reloc directive which supposedly is to be used for this purpose; still, if I try:

#include <stdio.h>
int main(int argc, char **argv)
{
        asm ("push %%rax\n\t"
             "lea 1f(%%rip), %%rax\n\t"
             "xchg %%rax, (%%rsp)\n\t"
             "jmp *0f\n\t"
             ".reloc 0f, R_X86_64_JUMP_SLOT, printf\n\t"
             "0: .quad 0\n"
             "1:\n\t"
             : : "D"("Hello, World!\n"));
        return 0;
}

I get an error from the linker (note: 7 == R_X86_64_JUMP_SLOT):

error: /tmp/cc6BUEZh.o: unexpected reloc 7 in object file


The assembler creates an object file for which readelf says:

Relocation section '.rela.text.startup' at offset 0x5e8 contains 2 entries:
    Offset             Info             Type               Symbol's Value  Symbol's Name + Addend
0000000000000001  000000050000000a R_X86_64_32            0000000000000000 .rodata.str1.1 + 0
0000000000000017  0000000b00000007 R_X86_64_JUMP_SLOT     0000000000000000 printf + 0


This is what I want - but the linker doesn't take it.
The linker does accept just using R_X86_64_64 instead above; doing that creates the same kind of binary as in the first case ... redirecting to printf@plt not the "resolved" one...

Recommended answer

In order to inline the call you would need a code (.text) relocation whose result is the final address of the function in the dynamically loaded shared library. No such relocation exists (and modern static linkers don't allow them) on x86_64 using a GNU toolchain for GNU/Linux, therefore you cannot inline the entire call as you wish to do.

The closest you can get is a direct call through the GOT (avoids PLT):

    .section    .rodata
.LC0:
    .string "Hello, World!\n"
    .text
    .globl  main
    .type   main, @function
main:
    pushq   %rbp
    movq    %rsp, %rbp
    movl    $.LC0, %edi
    xorl    %eax, %eax              # variadic call: %al = number of vector registers used
    call    *printf@GOTPCREL(%rip)
    nop
    popq    %rbp
    ret
    .size   main, .-main

This should generate an R_X86_64_GLOB_DAT relocation against printf in the GOT, to be used by the sequence above. You need to avoid C code because in general the compiler may use any number of caller-saved registers in the prologue and epilogue, and this forces you to save and restore all such registers around the asm function call or risk corrupting those registers for later use in the wrapper function. Therefore it is easier to write the wrapper in pure assembly.
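
If the wrapper has to stay in C, one possible alternative (not part of the approach above) is GCC's noplt function attribute, or the matching -fno-plt option, which asks the compiler itself to call through the GOT instead of the PLT. The following is a minimal sketch, assuming a GCC recent enough to support the attribute (version 6 or later, as far as I know) on x86_64; say_hello is a hypothetical name used only for illustration.

```c
#include <stdio.h>

/* Assumption: GCC with support for the noplt attribute. Redeclaring puts
 * with it asks the compiler to call through the GOT entry rather than
 * through puts@plt. */
extern int puts(const char *) __attribute__((noplt));

/* Hypothetical wrapper: on x86_64 the call below is expected to be emitted
 * as  call *puts@GOTPCREL(%rip)  instead of  call puts@plt. */
int say_hello(void)
{
    return puts("Hello, World!");
}
```

Disassembling the object file with objdump -dr should show whether the indirect call was actually generated; this still goes through a GOT slot, so it does not inline the target address into .text.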

Another option is to compile with -Wl,-z,now -Wl,-z,relro which ensures the PLT and PLT-related GOT entries are resolved at startup to increase code locality and compactness. With full RELRO you'll only have to run code in the PLT and access data in the GOT, two things which should already be somewhere in the cache hierarchy of the logical core. If full RELRO is enough to meet your needs then you wouldn't need wrappers and you would have added security benefits.
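
To confirm that the -z now request actually made it into the binary, the program can inspect its own dynamic section. This is only a sketch, assuming glibc's <link.h> (which provides ElfW and declares _DYNAMIC); bound_at_startup is a hypothetical helper name, and the check does not notice binding forced at run time through the LD_BIND_NOW environment variable.

```c
#include <link.h>    /* glibc-specific: ElfW(), _DYNAMIC, DT_* and DF_* constants */
#include <stddef.h>

/* Weak so that a statically linked build (no dynamic section) still links. */
extern ElfW(Dyn) _DYNAMIC[] __attribute__((weak));

/* Hypothetical helper: returns 1 if this executable was linked with
 * immediate binding (-z now), 0 otherwise. */
int bound_at_startup(void)
{
    if (_DYNAMIC == NULL)
        return 0;                       /* no dynamic section at all */

    for (const ElfW(Dyn) *d = _DYNAMIC; d->d_tag != DT_NULL; ++d) {
        if (d->d_tag == DT_BIND_NOW)
            return 1;
        if (d->d_tag == DT_FLAGS && (d->d_un.d_val & DF_BIND_NOW))
            return 1;
        if (d->d_tag == DT_FLAGS_1 && (d->d_un.d_val & DF_1_NOW))
            return 1;
    }
    return 0;
}
```

The same information is visible offline with readelf -d, which lists BIND_NOW or a FLAGS_1 entry containing NOW when the option took effect.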

The best options are really static linking or LTO if they are available to you.
