32位x86组件中堆栈对齐的责任 [英] Responsibility of stack alignment in 32-bit x86 assembly

查看:227
本文介绍了32位x86组件中堆栈对齐的责任的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我试图清楚地了解谁(调用方或被调用方)负责堆栈对齐. 64位汇编的情况很清楚,它是由 caller 调用的.

I am trying to get a clear picture of who (caller or callee) is reponsible of stack alignment. The case for 64-bit assembly is rather clear, that it is by caller.

请参阅系统V AMD64 ABI,第3.2.2节堆栈框架:

Referring to System V AMD64 ABI, section 3.2.2 The Stack Frame:

输入自变量区域的末尾应对齐16(32,如果 __m256在堆栈上传递)字节边界.

The end of the input argument area shall be aligned on a 16 (32, if __m256 is passed on stack) byte boundary.

换句话说,对于被调用函数的每个入口点,应该安全:

In other words, it should be safe to assume, that for every entry point of called function:

16 | (%rsp + 8)

保持(额外的8个是因为call隐式地将返回地址压入堆栈).

holds (extra eight is because call implicitely pushes return address on stack).

在32位环境中(假设cdecl)它的外观如何?我注意到gcc使用以下构造将对齐方式放入被调用函数内:

How it looks in 32-bit world (assuming cdecl)? I noticed that gcc places the alignment inside the called function with following construct:

and esp, -16

这似乎表明,这是被告人的责任.

which seems to indicate, that is callee's responsibility.

更清楚地说,请考虑以下NASM代码:

To put it clearer, consider following NASM code:

global main
extern printf
extern scanf
section .rodata
    s_fmt   db "%d %d", 0
    s_res   db `%d with remainder %d\n`, 0
section .text
main:
    start   0, 0
    sub     esp, 8
    mov     DWORD [ebp-4], 0 ; dividend
    mov     DWORD [ebp-8], 0 ; divisor

    lea     eax, [ebp-8]
    push    eax
    lea     eax, [ebp-4]
    push    eax
    push    s_fmt
    call    scanf
    add     esp, 12

    mov     eax, [ebp-4]
    cdq
    idiv    DWORD [ebp-8]

    push    edx
    push    eax
    push    s_res
    call    printf

    xor     eax, eax
    leave
    ret

在调用scanf之前是否需要对齐堆栈?如果是这样,则在将这两个参数推为scanf之前,需要将%esp减少四个字节:

Is it required to align the stack before scanf is called? If so, then this would require to decrease %esp by four bytes before pushing these two arguments to scanf as:

4 bytes (return address)
4 bytes (%ebp of previous stack frame)
8 bytes (for two variables)
12 bytes (three arguments for scanf)
= 28

推荐答案

仅GCC main中进行了额外的堆栈对齐;该功能很特殊.,如果您查看其他任何功能的代码源,除非您拥有带有alignas(32)或类似内容的本地语言,否则您将看不到它.

GCC only does this extra stack alignment in main; that function is special. You won't see it if you look at code-gen for any other function, unless you have a local with alignas(32) or something.

GCC只是对-m32采取了防御性的方法,不假定main是通过正确的16B对齐的堆栈调用的.或者,当-mpreferred-stack-boundary=4只是一个好主意,而不是法律时,就留下了这种特殊待遇.

GCC is just taking a defensive approach with -m32, by not assuming that main is called with a properly 16B-aligned stack. Or this special treatment is left over from when -mpreferred-stack-boundary=4 was only a good idea, not the law.

i386 System V ABI多年来一直保证/要求ESP + 4在功能上进行16B对齐. (即ESP必须在CALL指令之前 对齐16B,因此堆栈上的args从16B边界开始.这与x86-64 System V相同.)

The i386 System V ABI has guaranteed/required for years that ESP+4 is 16B-aligned on entry to a function. (i.e. ESP must be 16B-aligned before a CALL instruction, so args on the stack start at a 16B boundary. This is the same as for x86-64 System V.)

ABI还保证新的32位进程以在16B边界上对齐的ESP开始(例如,在_start,ELF入口点,ESP指向argc,而不是返回地址),以及glibc CRT代码保持一致.

The ABI also guarantees that new 32-bit processes start with ESP aligned on a 16B boundary (e.g. at _start, the ELF entry point, where ESP points at argc, not a return address), and the glibc CRT code maintains that alignment.

就调用约定而言,EBP只是另一个保留呼叫的寄存器.但是,是的,使用-fno-omit-frame-pointer的编译器输出确实会在其他调用保留寄存器(例如EBX)之前处理push ebp,因此保存的EBP值形成了一个链表. (因为它也执行mov ebp, esp部分,即在该推送之后设置帧指针.)

As far as the calling convention is concerned, EBP is just another call-preserved register. But yes, compiler output with -fno-omit-frame-pointer does take care to push ebp before other call-preserved registers (like EBX) so the saved EBP values form a linked list. (Because it also does the mov ebp, esp part of setting up a frame pointer after that push.)

gcc也许是防御性的,因为一个非常古老的Linux内核(从该版本升级到i386 ABI之前,当时所需的对齐方式仅为4B)可能违反了这一假设,而且这只是一条额外的指令,在生命周期中只能运行一次-处理时间(假设程序没有递归调用main).

Perhaps gcc is defensive because an extremely ancient Linux kernel (from before that revision to the i386 ABI, when the required alignment was only 4B) could violate that assumption, and it's only an extra couple instructions that run once in the life-time of the process (assuming the program doesn't call main recursively).

与gcc不同,clang假定堆栈在进入main时已正确对齐. (clang还在

Unlike gcc, clang assumes the stack is properly aligned on entry to main. (clang also assumes that narrow args have been sign or zero-extended to 32 bits, even though the current ABI revision doesn't specify that behaviour (yet). gcc and clang both emit code that does in the caller side, but only clang depends on it in the callee. This happens in 64-bit code, but I didn't check 32-bit.)

http://gcc.godbolt.org/上查看编译器输出以了解其他主要功能比主要的要好奇.

Look at compiler output on http://gcc.godbolt.org/ for main and functions other than main if you're curious.

我刚刚更新了 http://x86-64.org/仍然死了,似乎还没有回来,所以我更新了System V链接,以指向HJ Lu的github存储库中当前修订版的PDF,并且

I just updated the ABI links in the x86 tag wiki the other day. http://x86-64.org/ is still dead and seems to be not coming back, so I updated the System V links to point to the PDFs of the current revision in HJ Lu's github repo, and his page with links.

请注意,SCO网站上的最新版本不是 当前版本,并且不包括16B-stack-alignment要求.

Note that the last version on SCO's site is not the current revision, and doesn't include the 16B-stack-alignment requirement.

我认为某些BSD版本仍然不需要/保持16字节堆栈对齐.

I think some BSD versions still don't require / maintain 16-byte stack alignment.

这篇关于32位x86组件中堆栈对齐的责任的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆