When do we create base pointer in a function - before or after local variables?


Question


I am reading the book Programming from the Ground Up. I see two different examples of how the base pointer %ebp is set up from the current stack position %esp.

In one case, it is done before the local variables.

_start:
        # INITIALIZE PROGRAM
        subl  $ST_SIZE_RESERVE, %esp       # Allocate space for pointers on the
                                           # stack (file descriptors in this
                                           # case)
        movl  %esp, %ebp

_start, however, is not like other functions: it is the entry point of the program.

In another case it is done after.

power:
        pushl %ebp           # Save old base pointer
        movl  %esp, %ebp     # Make stack pointer the base pointer
        subl  $4, %esp       # Get room for our local storage

So my question is: do we first reserve space for local variables on the stack and then create the base pointer, or first create the base pointer and then reserve space for local variables?

Wouldn't both just work even if I mix them up in different functions of a program? One function does it before, the other does it after etc. Does C have a specific convention when it creates the machine code?

My reasoning is that all the code in a function would be relative to the base pointer, so as long as that function follows the convention according to which it created a reference of the stack, it just works?

A few related links for those who are interested:

Function Prologue

Answer

> In one case, it is done before the local variables.

_start is not a function. It's your entry point. There's no return address, and no caller's value of %ebp to save.

The i386 System V ABI doc suggests (in section 2.3.1 Initial Stack and Register State) that you might want to zero %ebp to mark the deepest stack frame. (i.e. before your first call instruction, so the linked list of saved ebp values has a NULL terminator when that first function pushes the zeroed ebp. See below).

> Does C have a specific convention when it creates the machine code?

No, unlike in some other x86 systems, the i386 System V ABI doesn't require much about your stack-frame layout. (Linux uses the System V ABI / calling convention, and the book you're using (PGU) is for Linux.)

In some calling conventions, setting up ebp is not optional, and the function entry sequence has to push ebp just below the return address. This creates a linked list of stack frames which allows an exception handler (or debugger) to backtrace up the stack. (See "How to generate the backtrace by looking at the stack values?") I think this is required in 32-bit Windows code for SEH (structured exception handling), at least in some cases, but I don't know the details.
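To make the "linked list of stack frames" idea concrete, here is a minimal C sketch that simulates it rather than reading a real stack. Everything in it is an assumption made for illustration: the stack is a small array of 32-bit words, an "address" is just an index into that array, and the names enter_frame, backtrace_fp and demo_backtrace are hypothetical. It is not how gdb or any real unwinder is implemented.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model: the stack is an array of 32-bit words and an
   "address" is an index into it. Index 0 stands in for NULL. */
enum { STACK_WORDS = 64 };

/* Simulate "call" followed by the prologue "push %ebp; mov %esp, %ebp".
   *esp is the stack pointer as a word index; the stack grows downward.
   Returns the new EBP. */
static uint32_t enter_frame(uint32_t *stack, uint32_t *esp,
                            uint32_t old_ebp, uint32_t ret_addr)
{
    assert(*esp > 2);            /* keep index 0 free to act as NULL */
    stack[--*esp] = ret_addr;    /* what "call" pushes */
    stack[--*esp] = old_ebp;     /* push %ebp */
    return *esp;                 /* mov %esp, %ebp */
}

/* Walk the chain of saved EBP values, storing each frame's return
   address in out[]. Stops at the zero terminator or after max entries. */
static size_t backtrace_fp(const uint32_t *stack, uint32_t ebp,
                           uint32_t *out, size_t max)
{
    size_t n = 0;
    while (ebp != 0 && n < max) {
        out[n++] = stack[ebp + 1];  /* return address sits just above saved EBP */
        ebp = stack[ebp];           /* follow the link to the caller's frame */
    }
    return n;
}

/* Three nested calls starting from an entry point that zeroed EBP.
   The return addresses 0x1000, 0x2000, 0x3000 are made-up values. */
static size_t demo_backtrace(uint32_t *out, size_t max)
{
    uint32_t stack[STACK_WORDS] = {0};
    uint32_t esp = STACK_WORDS;  /* empty stack */
    uint32_t ebp = 0;            /* _start zeroed EBP before its first call */

    ebp = enter_frame(stack, &esp, ebp, 0x1000);
    ebp = enter_frame(stack, &esp, ebp, 0x2000);
    ebp = enter_frame(stack, &esp, ebp, 0x3000);
    return backtrace_fp(stack, ebp, out, max);
}
```

Calling demo_backtrace with an 8-entry buffer fills in the return addresses from the innermost frame outward, and the walk ends at the zeroed EBP that the entry point left behind. If one function in the chain skipped the push, the walk would simply follow the wrong link, which is why conventions that depend on this chain make the prologue mandatory.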

The i386 SysV ABI defines an alternate mechanism for stack unwinding which makes frame pointers optional, using metadata in separate sections: .eh_frame and .eh_frame_hdr. That metadata is generated from .cfi_... assembler directives, which in theory you could write yourself if you wanted stack unwinding through your function to work (for example, if you were calling any C++ code which expected throw to work).

If you want to use the traditional frame-walking in current gdb, you have to actually do it yourself by defining a GDB function (see "gdb backtrace by walking frame pointers" or "Force GDB to use frame-pointer based unwinding"). Or apparently, if your executable has no .eh_frame section at all, gdb will use the EBP-based stack-walking method.

If you compile with gcc -fno-omit-frame-pointer, your call stack will have this linked-list property, because when C compilers do make proper stack frames, they push ebp first.

IIRC, perf has a mode for using the frame-pointer chain to get backtraces while profiling (perf record --call-graph fp), and apparently this can be more reliable than unwinding from .eh_frame metadata for correctly accounting which functions are responsible for using the most CPU time. (Or causing the most cache misses, branch mispredicts, or whatever else you're counting with performance counters.)


> Wouldn't both just work even if I mix them up in different functions of a program? One function does it before, the other does it after etc.

Yes, it would work fine. In fact setting up ebp at all is optional, but when writing by hand it's easier to have a fixed base (unlike esp which moves around when you push/pop).

For the same reason, it's easier to stick to the convention of mov %esp, %ebp after one push (of the old %ebp), so the first function arg is always at ebp+8. See "What is stack frame in assembly?" for the usual convention.
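The point that either order works, as long as each function addresses its own frame consistently, can be checked with a little arithmetic. The following C sketch models only the two stack registers; the struct and function names are hypothetical, and the store done by push is deliberately not modeled because only the addresses matter here.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical register model: only the two stack registers matter. */
struct regs { uint32_t esp, ebp; };

/* "push %ebp; mov %esp, %ebp; sub $locals, %esp": frame pointer first. */
static struct regs prologue_fp_first(struct regs r, uint32_t locals)
{
    assert(locals % 4 == 0);
    r.esp -= 4;        /* push %ebp (the store itself is not modeled) */
    r.ebp = r.esp;     /* mov %esp, %ebp */
    r.esp -= locals;   /* sub $locals, %esp */
    return r;
}

/* "sub $locals, %esp; mov %esp, %ebp": reserve space first. */
static struct regs prologue_reserve_first(struct regs r, uint32_t locals)
{
    assert(locals % 4 == 0);
    r.esp -= locals;   /* sub $locals, %esp */
    r.ebp = r.esp;     /* mov %esp, %ebp */
    return r;
}

/* Displacement you would write in disp(%ebp) to reach an absolute address. */
static int32_t disp_from_ebp(struct regs r, uint32_t addr)
{
    return (int32_t)((int64_t)addr - (int64_t)r.ebp);
}
```

With an entry ESP of 0x1000 and 4 bytes of locals, the frame-pointer-first prologue puts the first argument (at entry ESP + 4) at 8(%ebp) and the local at -4(%ebp). With the reserve-first order and 8 bytes of locals, the locals sit at 0(%ebp) and 4(%ebp), and whatever was at the entry ESP is at 8(%ebp). Both are self-consistent; they just use different displacements.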

But you could maybe save code size by having ebp point in the middle of some space you reserved, so all the memory addressable with an ebp + disp8 addressing mode is usable. (disp8 is a signed 8-bit displacement: -128 to +124 if we're limiting to 4-byte aligned locations.) This saves code bytes vs. needing a disp32 to reach farther. So you might do:

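bigfunc:
    push   %ebp
    lea    -112(%esp), %ebp   # first arg at ebp+8+112 = 120(%ebp)
    sub    $236, %esp         # locals from -124(%ebp) ... 108(%ebp)
                              # saved EBP at 112(%ebp), ret addr at 116(%ebp)
                              # 236 + 4 (saved EBP) + 4 (ret addr) = 244, so if %esp was
                              # 16-byte aligned before the call, it is now 12 bytes above
                              # a 16-byte boundary; three more 4-byte pushes realign it.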

Or delay saving any registers until after reserving space for locals, so we aren't using up any of the locations (other than the ret addr) with saved values we never want to address.

bigfunc2:                     # first arg at 4(%esp)
    sub    $252, %esp         # first arg at 252+4(%esp)
    push   %ebp               # first arg at 252+4+4(%esp)
    lea    140(%esp), %ebp    # first arg at 260-140 = 120(%ebp)

    push   %edi              # save the other call-preserved regs
    push   %esi
    push   %ebx
             # %esp is 16-byte aligned after these pushes, in case that matters
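As a sanity check on the offsets in both layouts, here is a small C sketch that counts how many 4-byte local slots a disp8 can reach. The function names are hypothetical and the ranges passed in are read off the comments in the two examples, so treat it as arithmetic on those comments rather than anything an assembler does.

```c
#include <assert.h>
#include <stdint.h>

/* A displacement fits in disp8 if it is in the signed 8-bit range. */
static int fits_disp8(int32_t disp)
{
    return disp >= -128 && disp <= 127;
}

/* Count the 4-byte slots at displacements lo, lo+4, ..., hi from EBP
   (inclusive, both multiples of 4) that are reachable with a disp8. */
static int disp8_slots(int32_t lo, int32_t hi)
{
    assert(lo % 4 == 0 && hi % 4 == 0);
    int n = 0;
    for (int32_t d = lo; d <= hi; d += 4)
        n += fits_disp8(d);
    return n;
}
```

For bigfunc, disp8_slots(-124, 108) gives 59, so all 236 bytes of locals are reachable with a one-byte displacement. A conventional frame with the same 236 bytes of locals spans -236(%ebp) to -4(%ebp), and disp8_slots(-236, -4) gives only 32. For bigfunc2 the reserved area spans -136(%ebp) to 112(%ebp), of which 61 slots are reachable, with the two lowest slots needing a disp32.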

(Remember to be careful how you restore registers and clean up. You can't use leave, because it copies ebp to esp before popping, and here ebp doesn't point at the saved EBP. With the "normal" stack frame sequence, you might restore other pushed registers (from near the saved EBP) with mov, then use leave. Or restore esp to point at the last push (with add), and use pop instructions.)
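Why leave breaks with a biased frame pointer can be shown with a tiny simulation. This is a sketch under stated assumptions: the struct cpu model, the 1 KiB stack, the starting register values, and the alternative teardown "lea 112(%ebp), %esp ; pop %ebp" are all made up for illustration, and the frame being torn down is the bigfunc layout from above.

```c
#include <assert.h>
#include <stdint.h>

enum { MEM_WORDS = 256 };  /* hypothetical 1 KiB stack, byte addresses 0..1023 */

struct cpu { uint32_t esp, ebp; uint32_t mem[MEM_WORDS]; };

static void push(struct cpu *c, uint32_t v)
{
    assert(c->esp >= 4 && c->esp <= MEM_WORDS * 4);
    c->esp -= 4;
    c->mem[c->esp / 4] = v;
}

static uint32_t pop(struct cpu *c)
{
    assert(c->esp + 4 <= MEM_WORDS * 4);
    uint32_t v = c->mem[c->esp / 4];
    c->esp += 4;
    return v;
}

/* leave == mov %ebp, %esp ; pop %ebp */
static void leave(struct cpu *c)
{
    c->esp = c->ebp;
    c->ebp = pop(c);
}

/* Conventional frame: push %ebp; mov %esp,%ebp; sub $n,%esp ... leave.
   Returns 1 if ESP and EBP are back to their entry values. */
static int conventional_ok(uint32_t n)
{
    struct cpu c = { .esp = 512, .ebp = 0x11223344 };
    uint32_t esp0 = c.esp, ebp0 = c.ebp;
    push(&c, c.ebp);
    c.ebp = c.esp;
    c.esp -= n;
    leave(&c);
    return c.esp == esp0 && c.ebp == ebp0;
}

/* bigfunc-style frame: push %ebp; lea -112(%esp),%ebp; sub $236,%esp.
   Torn down with leave if use_leave is nonzero, otherwise with the
   hypothetical sequence "lea 112(%ebp),%esp ; pop %ebp". */
static int biased_ok(int use_leave)
{
    struct cpu c = { .esp = 512, .ebp = 0x11223344 };
    uint32_t esp0 = c.esp, ebp0 = c.ebp;
    push(&c, c.ebp);
    c.ebp = c.esp - 112;
    c.esp -= 236;
    if (use_leave) {
        leave(&c);
    } else {
        c.esp = c.ebp + 112;   /* undo the bias: ESP now points at saved EBP */
        c.ebp = pop(&c);
    }
    return c.esp == esp0 && c.ebp == ebp0;
}
```

conventional_ok returns 1 for any frame size that fits the model, while biased_ok(1) returns 0: leave pops from 112 bytes below the saved EBP, so both registers come back wrong. biased_ok(0) returns 1 because the bias is undone before the pop.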

But if you're going to do this, there's no advantage to using ebp instead of ebx or some other register. In fact, there's a disadvantage to using ebp: the 0(%ebp) addressing mode requires a disp8 of 0 instead of no displacement, whereas (%ebx) doesn't. So use %ebp as a non-pointer scratch register, or at least as one that you don't dereference without a displacement. (This quirk is irrelevant with a real frame pointer: (%ebp) is the saved EBP value, which a function rarely needs to load. And BTW, the encoding that would mean (%ebp) with no displacement is how the ModRM byte encodes a disp32 with no base register, like (12345) or my_label.)
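The encoding quirk can be summarized as a size calculation. The sketch below is a simplified model of 32-bit ModRM addressing for a plain disp(base) operand: it ignores index registers, prefixes and the opcode itself, and the function name mem_operand_bytes is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Register numbers as used in the ModRM r/m field (32-bit mode). */
enum reg32 { EAX, ECX, EDX, EBX, ESP, EBP, ESI, EDI };

/* Bytes needed to encode the memory operand disp(base) in 32-bit mode:
   ModRM byte + optional SIB byte + displacement bytes.
   Index registers are not modeled. */
static int mem_operand_bytes(enum reg32 base, int32_t disp)
{
    assert(base >= EAX && base <= EDI);
    int bytes = 1;                 /* ModRM */
    if (base == ESP)
        bytes += 1;                /* r/m=100 means "SIB byte follows" */
    if (disp == 0 && base != EBP)
        return bytes;              /* mod=00: no displacement */
    if (disp >= -128 && disp <= 127)
        return bytes + 1;          /* mod=01: disp8 */
    return bytes + 4;              /* mod=10: disp32 */
}
```

So (%ebx) costs 1 byte of addressing, while 0(%ebp) costs 2 because the mod=00, r/m=101 slot is taken by the "disp32 with no base" form. Using %esp as a base also costs an extra byte for the SIB, and anything outside the disp8 range jumps to 5 bytes, which is the cost the biased frame pointer is trying to avoid.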

These examples are pretty artificial; you usually don't need that much space for locals unless it's an array, and then you'd use indexed addressing modes or pointers, not just a disp8 relative to ebp. But maybe you need space for a few 32-byte AVX vectors. In 32-bit code with only 8 vector registers, that's plausible.

AVX512 compressed disp8 mostly defeats this argument for 64-byte AVX512 vectors, though. (But AVX512 in 32-bit mode can still only use 8 vector registers, zmm0-zmm7, so you could easily need to spill some. You only get x/ymm8-15 and zmm8-31 in 64-bit mode.)
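For reference, the compressed disp8 rule for EVEX-encoded instructions scales the one-byte displacement by a factor N that depends on the memory operand. The sketch below assumes a full 64-byte vector load or store without broadcast, where N is 64; the function name is hypothetical and other operand types use different values of N.

```c
#include <assert.h>
#include <stdint.h>

/* Returns 1 if disp can be encoded as a compressed disp8 with scale n,
   i.e. disp is a multiple of n and disp / n fits in a signed byte.
   Passing n = 1 gives the ordinary (legacy) disp8 rule. */
static int fits_disp8xN(int32_t disp, int32_t n)
{
    assert(n > 0);
    if (disp % n != 0)
        return 0;                  /* not a multiple of n: needs disp32 */
    int32_t scaled = disp / n;
    return scaled >= -128 && scaled <= 127;
}
```

With n = 64, offsets from -8192 to +8128 in steps of 64 fit in one byte, so a stack slot at -192(%ebp) that would need a disp32 under the legacy rule is a disp8 for a 64-byte vector access. A displacement that is not a multiple of 64, such as 200, still needs a disp32.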
