gcc x86-32 stack alignment and calling printf

Question

To the best of my knowledge, x86-64 requires the stack to be 16-byte aligned before a call, while gcc with -m32 doesn't require this for main.

I have the following test code:

.data
intfmt:         .string "int: %d\n"
testint:        .int    20

.text
.globl main

main:
    mov     %esp, %ebp
    push    testint
    push    $intfmt
    call    printf
    mov     %ebp, %esp
    ret

Build with as --32 test.S -o test.o && gcc -m32 test.o -o test. I am aware that syscall write exists, but to my knowledge it cannot print ints and floats the way printf can.

After entering main, a 4-byte return address is on the stack. Interpreting this code naively, the two push instructions each put 4 bytes on the stack, so another 4-byte value would have to be pushed before the call for the stack to be aligned.

Here is the objdump of the binary generated by gas and gcc:

0000053d <main>:
 53d:   89 e5                   mov    %esp,%ebp
 53f:   ff 35 1d 20 00 00       pushl  0x201d
 545:   68 14 20 00 00          push   $0x2014
 54a:   e8 fc ff ff ff          call   54b <main+0xe>
 54f:   89 ec                   mov    %ebp,%esp
 551:   c3                      ret    
 552:   66 90                   xchg   %ax,%ax
 554:   66 90                   xchg   %ax,%ax
 556:   66 90                   xchg   %ax,%ax
 558:   66 90                   xchg   %ax,%ax
 55a:   66 90                   xchg   %ax,%ax
 55c:   66 90                   xchg   %ax,%ax
 55e:   66 90                   xchg   %ax,%ax

I am very confused about the push instructions generated.

  1. If two 4 byte values are pushed, how is alignment achieved?
  2. Why is 0x2014 pushed instead of 0x14? What is 0x201d?
  3. What does call 54b even achieve? Output of hd matches objdump. Why is this different in gdb? Is this the dynamic linker?

B+>│0x5655553d <main>                       mov    %esp,%ebp                      │
   │0x5655553f <main+2>                     pushl  0x5655701d                     │
   │0x56555545 <main+8>                     push   $0x56557014                    │
   │0x5655554a <main+13>                    call   0xf7e222d0 <printf>            │
   │0x5655554f <main+18>                    mov    %ebp,%esp                      │
   │0x56555551 <main+20>                    ret  

Resources on what goes on when a binary is actually executed are appreciated, since I don't know what's actually going on and the tutorials I've read don't cover it. I'm in the process of reading through "How programs get run: ELF binaries".

Recommended answer

The i386 System V ABI does guarantee / require 16 byte stack alignment before a call, like I said at the top of my answer that you linked. (Unless you're calling a private helper function, in which case you can make up your own rules for alignment, arg-passing, and which registers are clobbered for that function.)
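
The alignment arithmetic behind this requirement can be checked with a small model. The sketch below is Python rather than assembly, tracks ESP modulo 16 only, and assumes main's caller followed the ABI (so ESP % 16 == 12 on entry, after the 4-byte return address was pushed). The function name esp_mod_at_call and the extra 4-byte padding step are illustrative assumptions, not part of the question's code.

```python
ALIGN = 16  # required stack alignment (in bytes) before a call


def esp_mod_at_call(adjustments, entry_mod=12):
    """Return ESP % 16 just before `call`.

    adjustments: byte counts subtracted from ESP, in order (pushes or sub).
    entry_mod:   ESP % 16 on entry to the function. 12 assumes the caller had
                 ESP 16-byte aligned before its own call, which then pushed a
                 4-byte return address.
    """
    esp = entry_mod
    for nbytes in adjustments:
        esp = (esp - nbytes) % ALIGN
    return esp


# The question's code: two 4-byte pushes, then call printf.
print(esp_mod_at_call([4, 4]))      # 4, so the stack is misaligned at the call

# Hypothetical fix: 4 bytes of padding before the two argument pushes.
print(esp_mod_at_call([4, 4, 4]))   # 0, so the stack is aligned at the call
```

In the assembly source, the padding could be a sub $4, %esp (or a dummy push) placed before the two argument pushes, so that the arguments stay at the top of the stack.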

Functions are allowed to crash or misbehave if you violate this ABI requirement, but are not required to. For example, scanf in x86-64 Ubuntu glibc (as compiled by recent gcc) only recently started doing that; see "scanf Segmentation faults when called from a function that doesn't change RSP".

Functions can depend on stack alignment for performance (to align a double or array of doubles to avoid cache-line splits when accessing them).

Usually the only case where a function depends on stack alignment for correctness is when compiled to use SSE/SSE2, so it can use 16-byte alignment-required loads/stores to copy a struct or array (movaps or movdqa), or to actually auto-vectorize a loop over a local array.

I think Ubuntu doesn't compile their 32-bit libraries with SSE (except functions like memcpy that use runtime dispatching), so they can still work on ancient CPUs like Pentium II. Multiarch libraries on an x86-64 system should assume SSE2, but with 4-byte pointers it's less likely that 32-bit functions would have 16 byte structs to copy.

Anyway, whatever the reason, obviously printf in your 32-bit build of glibc doesn't actually depend on 16-byte stack alignment for correctness, so it doesn't fault even when you misalign the stack.

Why is 0x2014 pushed instead of 0x14? What is 0x201d?

0x14 (decimal 20) is the value in memory at that location. It will be loaded at runtime, because you used push r/m32, not push $20 (or an assemble time constant like .equ testint, 20 or testint = 20).
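
To see which operand is an address and which is an immediate, the two push encodings from the objdump listing can be decoded by hand. This is a minimal Python sketch that handles only the two encodings shown in the question; decode_push is a hypothetical helper name, not a real disassembler API.

```python
def decode_push(insn_bytes):
    """Decode the two push forms that appear in the question's objdump output."""
    if insn_bytes[:2] == bytes([0xFF, 0x35]):
        # FF /6 with ModRM 0x35 (mod=00, reg=6, rm=101): push dword at [disp32]
        return ("push m32", int.from_bytes(insn_bytes[2:6], "little"))
    if insn_bytes[0] == 0x68:
        # 68 id: push a 32-bit immediate
        return ("push imm32", int.from_bytes(insn_bytes[1:5], "little"))
    raise ValueError("encoding not handled by this sketch")


kind, operand = decode_push(bytes.fromhex("ff351d200000"))
print(kind, hex(operand))   # push m32 0x201d   (address of testint; memory there holds 20)

kind, operand = decode_push(bytes.fromhex("6814200000"))
print(kind, hex(operand))   # push imm32 0x2014 (address of intfmt, pushed as a constant)
```

So 0x201d is the link-time address of testint, whose contents (0x14) are read when the instruction executes, while 0x2014 is the link-time address of intfmt pushed directly.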

You used gcc -m32 to make a PIE (Position Independent Executable), which is relocated at runtime, because that's the default on Ubuntu's gcc.

0x2014 is an offset relative to the executable's load base, not a final address: for a PIE, objdump shows link-time addresses as if the image were loaded at 0. If you disassemble at runtime after starting the program, you'll see the real address (0x56557014 in your gdb output).
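
The relationship between the objdump and gdb addresses can be checked with plain subtraction. The Python sketch below uses only the addresses quoted in the question; the symbol names attached to 0x2014 and 0x201d are inferred from the source (intfmt is 9 bytes long, so testint follows at 0x201d).

```python
# (link-time address from objdump, run-time address from gdb), from the question
pairs = {
    "main":    (0x53D,  0x5655553D),
    "intfmt":  (0x2014, 0x56557014),
    "testint": (0x201D, 0x5655701D),
}

# Run-time address minus link-time address gives the load base chosen for the PIE.
bases = {name: run - link for name, (link, run) in pairs.items()}

for name, base in bases.items():
    print(f"{name}: load base = {base:#x}")   # 0x56555000 for every symbol
```

Every pair yields the same base, 0x56555000, which is what relocating the whole image by one constant looks like.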

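Same for call 54b. The rel32 field still holds the placeholder -4 (bytes fc ff ff ff), so objdump decodes a target inside the call instruction itself. Since your gdb output shows a direct call to printf in libc rather than to a PLT stub, the dynamic linker apparently fills in the displacement at load time with a text relocation.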
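
As a cross-check on where 54b comes from, the call's bytes from the objdump listing can be decoded by hand. This is a minimal Python sketch; call_target is a hypothetical helper, and it handles only the 5-byte E8 rel32 form.

```python
def call_target(insn_addr, insn_bytes):
    """Decode an x86 `call rel32` (opcode E8) and return its target address."""
    if len(insn_bytes) != 5 or insn_bytes[0] != 0xE8:
        raise ValueError("not a 5-byte call rel32")
    # The displacement is a signed little-endian 32-bit value ...
    rel32 = int.from_bytes(insn_bytes[1:], "little", signed=True)
    # ... added to the address of the *next* instruction.
    next_insn = insn_addr + len(insn_bytes)
    return (next_insn + rel32) & 0xFFFFFFFF


# Bytes at 0x54a in the question's objdump output: e8 fc ff ff ff
print(hex(call_target(0x54A, bytes.fromhex("e8fcffffff"))))   # 0x54b
```

The displacement decodes to -4, so the target is 0x54f - 4 = 0x54b, one byte into the call instruction itself. A displacement of -4 is the value an i386 call field usually holds before its PC-relative relocation has been applied, which fits the real target only appearing at runtime.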

If you disassembled with objdump -drwC, you'd see symbol relocation info. (I like -Mintel as well, but beware it's MASM-like, not NASM).

You can link with gcc -m32 -no-pie to make a classic position-dependent executable. I'd definitely recommend that, especially for 32-bit code. If you're also compiling C, use gcc -m32 -no-pie -fno-pie to get non-PIE code-gen as well as linking into a non-PIE executable. (See "32-bit absolute addresses no longer allowed in x86-64 Linux?" for more about PIEs.)
