Do C compilers optimize away functions in assembly so they minimize use of the stack?


Question


I am starting to learn assembly (x86-64 in NASM on OSX), and am now exploring how functions look in it.

Most resources explaining how "calling conventions" work show examples along the lines of this:

// C code
MyFunction1(a, b);

; assembly code (NASM operand order: destination first)
main:
  push rbp           ; save caller's frame pointer on stack
  mov rbp, rsp       ; save stack pointer in frame pointer
  push b             ; push arguments in reverse order
  push a
  call _MyFunction1
  add rsp, 16        ; caller removes the two arguments from the stack
  xor rax, rax       ; set function return value to 0
  mov rsp, rbp       ; restore stack pointer
  pop rbp            ; restore frame pointer
  ret                ; return to calling function

(I just made that up after combining several resources, so there are probably many problems with it, but that's outside the main question.)

The gist of calling conventions like cdecl is that:

  • You push the arguments in reverse order from how they appear in the C code.
  • You then save a reference to both the frame and stack pointers (I'm guessing so you can do this recursively, but I haven't got far enough to know yet).
  • Then you compute whatever you need to in your function.
  • And after that, you pop the stack and frame pointers off the stack.

So, in hopes of getting some more practical experience working with the stack and function calling conventions in assembly, I was hoping to see how existing C compilers convert function calls into assembly (gcc and clang, using this, which is great). But I am not seeing that calling convention pattern (which every resource I've seen so far has said is the way to do it)!

Check it out, here is a relatively complex bit of assembly code generated from some C:

https://gist.github.com/lancejpollard/a1d6a9b4820473ed8797

Looking through that C code, there are a couple of levels of nested function calls. But the output assembly code doesn't show that push/pop stack pattern!

So the question is, are these compilers just optimizing the assembly so as to avoid doing that? (That example C code, while it has some nested functions, is still pretty simple, so I imagine the compiler can precompute a lot of stuff.) Or am I missing something?

Answer

In general:

  • if a function may be called by something in a different object file (e.g. the function isn't static and you are creating an object file) then the compiler has to respect the calling conventions (so that linking can work).
  • if you're compiling for debugging then a typical debugger will use the stack frames to find things (input parameters, local variables, etc) and performance isn't a high priority, so respecting the calling conventions is a good idea.

For all other cases the compiler is free to completely ignore calling conventions and do whatever it thinks is more efficient (including passing parameters in registers and not using the stack or frame pointer, and also including inlining functions completely).
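A minimal sketch of that distinction, assuming a hypothetical source file with one `static` helper and one externally visible function (the names `square_internal` and `sum_of_squares` are made up for illustration). Because `square_internal` has internal linkage, an optimising compiler is free to inline it or pass its argument however it likes. `sum_of_squares` can be called from other object files, so its externally visible entry point has to follow the platform's calling convention.

```c
/* Internal linkage: nothing outside this file can call this function,
 * so the compiler may inline it or use a non-standard way of passing x. */
static int square_internal(int x)
{
    return x * x;
}

/* External linkage: other object files may call this, so the version
 * the linker sees must follow the platform's calling convention. */
int sum_of_squares(int a, int b)
{
    return square_internal(a) + square_internal(b);
}
```

Compiling a file like this with `gcc -O0 -S` and then with `gcc -O2 -S` and comparing the two outputs is one way to see the effect. At higher optimisation levels the helper will typically not appear as a separate function at all, though the exact output depends on the compiler and version.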

However:

  • for some cases (e.g. function pointers and functions with a variable number of arguments) the compiler may not be able to do much and might just use the standard calling conventions (even when it doesn't strictly need to).
  • it is possible for the compiler to generate 2 or more different versions of a function (e.g. one that respects the calling convention that code in other object files can use, plus another optimised version that doesn't respect calling convention).
  • the linker may do optimisations that the compiler couldn't (link-time optimisation), including changing the calling convention of functions that the compiler couldn't optimise because it didn't know enough about their potential callers, whereas the linker "sees" the whole program.
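The first case in the list above can be sketched as follows. This is only an illustration; `sum_ints`, `add`, `mul`, `apply` and `demo` are hypothetical names, not taken from the question's code. A variadic callee finds its arguments through `va_list`, which relies on the caller having put them where the platform ABI says. A call through a function pointer generally can't be tailored to one particular callee, because the compiler may not know which function the pointer refers to.

```c
#include <stdarg.h>

/* Variadic function: the callee walks its arguments with va_arg, so the
 * caller has to lay them out the way the standard convention specifies. */
int sum_ints(int count, ...)
{
    va_list args;
    int total = 0;

    va_start(args, count);
    for (int i = 0; i < count; i++) {
        total += va_arg(args, int);
    }
    va_end(args);
    return total;
}

static int add(int a, int b) { return a + b; }
static int mul(int a, int b) { return a * b; }

/* Indirect call: op could point at any function with this signature,
 * so the call site normally uses the standard convention. */
int apply(int (*op)(int, int), int a, int b)
{
    return op(a, b);
}

/* add and mul are static, but their addresses are taken here, which
 * limits how freely the compiler can change the way they are called. */
int demo(int use_mul, int a, int b)
{
    return apply(use_mul ? mul : add, a, b);
}
```

In a file this small an optimiser may still see through the pointer and inline everything. The constraint matters mostly when the pointer's target can't be determined at compile time.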
