What is the meaning of each line of the assembly output of a C hello world?

Question

I ran gcc -S over this:

#include <stdio.h>  /* declares printf() */

int main()
{
    printf("Hello world!");
}

and I got this assembly code:

        .file   "test.c"
        .section        .rodata
.LC0:
        .string "Hello world!"
        .text
.globl main
        .type   main, @function
main:
        leal    4(%esp), %ecx
        andl    $-16, %esp
        pushl   -4(%ecx)
        pushl   %ebp
        movl    %esp, %ebp
        pushl   %ecx
        subl    $20, %esp
        movl    $.LC0, (%esp)
        call    printf
        addl    $20, %esp
        popl    %ecx
        popl    %ebp
        leal    -4(%ecx), %esp
        ret
        .size   main, .-main
        .ident  "GCC: (GNU) 4.3.0 20080428 (Red Hat 4.3.0-8)"
        .section        .note.GNU-stack,"",@progbits

I am curious to understand this output. Can someone share some pointers in understanding this output, or if someone could mark comments against each of these lines/group of lines explaining what it does it would be great.

Recommended answer

Here is how it goes:

        .file   "test.c"

The original source file name (used by debuggers).

        .section        .rodata
.LC0:
        .string "Hello world!"

A zero-terminated string is included in the section ".rodata" ("ro" means "read-only": the application will be able to read the data, but any attempt at writing into it will trigger an exception).

        .text

Now we write things into the ".text" section, which is where code goes.

.globl main
        .type   main, @function
main:

We define a function called "main" and make it globally visible (other object files will be able to invoke it).

        leal    4(%esp), %ecx

We store in register %ecx the value 4+%esp (%esp is the stack pointer).

        andl    $-16, %esp

%esp is rounded down so that it becomes a multiple of 16. For some data types (the floating-point formats corresponding to C's double and long double), performance is better when memory accesses are at addresses that are multiples of 16. This is not really needed here, but when compiling without an optimization flag (-O2...), the compiler tends to produce quite a lot of generic, useless code (i.e. code which could be useful in some cases but not here).

        pushl   -4(%ecx)

This one is a bit weird: at that point, the word at address -4(%ecx) is the word which was on top of the stack prior to the andl. The code retrieves that word (which should be the return address, by the way) and pushes it again. This kind of emulates what would be obtained with a call from a function which had a 16-byte aligned stack. My guess is that this push is a remnant of an argument-copying sequence. Since the function has adjusted the stack pointer, it must copy the function arguments, which were accessible through the old value of the stack pointer. Here, there is no argument, except the function return address. Note that this word will not be used (yet again, this is code without optimization).

        pushl   %ebp
        movl    %esp, %ebp

This is the standard function prologue: we save %ebp (since we are about to modify it), then set %ebp to point to the stack frame. Thereafter, %ebp will be used to access the function arguments, making %esp free again. (Yes, there is no argument, so this is useless for that function.)

        pushl   %ecx

We save %ecx (we will need it at function exit, to restore %esp to the value it had before the andl).

        subl    $20, %esp

We reserve 20 bytes on the stack (remember that the stack grows "down"). Together with the 12 bytes pushed since the andl, this moves %esp down by 32 bytes in total, so it is again a multiple of 16 at the call. That space will be used to store the arguments to printf() (that's overkill, since there is a single argument, which will use 4 bytes [that's a pointer]).

        movl    $.LC0, (%esp)
        call    printf

We "push" the argument to printf() (i.e. we make sure that %esp points to a word which contains the argument, here $.LC0, which is the address of the constant string in the rodata section). Then we call printf().

        addl    $20, %esp

When printf() returns, we remove the space allocated for the arguments. This addl cancels what the subl above did.

        popl    %ecx

We recover %ecx (pushed above); printf() may have modified it (the calling convention describes which registers a function may modify without restoring them upon exit; %ecx is one such register).

        popl    %ebp

Function epilogue: this restores %ebp (it corresponds to the pushl %ebp above).

        leal    -4(%ecx), %esp

We restore %esp to its initial value. The effect of this opcode is to store in %esp the value %ecx-4. %ecx was set in the first function opcode. This cancels any alteration to %esp, including the andl.

        ret

Function exit.

        .size   main, .-main

This sets the size of the main() function: at any point during assembly, "." is an alias for "the address at which we are adding things right now". If another instruction was added here, it would go at the address specified by ".". Thus, ".-main", here, is the exact size of the code of the function main(). The .size directive instructs the assembler to write that information in the object file.

        .ident  "GCC: (GNU) 4.3.0 20080428 (Red Hat 4.3.0-8)"

GCC just loves to leave traces of its action. This string ends up as a kind of comment in the object file (in the .comment section). It has no effect on the running program; the linker typically keeps it in the executable, where a tool such as strip can remove it.

        .section        .note.GNU-stack,"",@progbits

A special section where GCC writes that the code can accommodate a non-executable stack. This is the normal case. Executable stacks are needed for some special usages (not standard C). On modern processors, the kernel can make a non-executable stack (a stack which triggers an exception if someone tries to execute as code some data which is on the stack); this is viewed by some people as a "security feature" because putting code on the stack is a common way to exploit buffer overflows. With this section, the executable will be marked as "compatible with a non-executable stack" which the kernel will happily provide as such.
