Understanding the purpose of some assembly statements

Question

I am trying to understand some assembly code and have worked out most of it, except for a few lines. I can follow most of what happens in the middle, but I do not fully understand what is happening (and why) at the beginning and end of the code. Can someone shed some light on this?

int main() {
    int a, b;
    a = 12;
    b = 20;
    b = a + 123;
    return 0;
}

Disassembled Version:

 8048394:8d 4c 24 04          lea    0x4(%esp),%ecx              ; ??
 8048398:83 e4 f0             and    $0xfffffff0,%esp            ; ??
 804839b:ff 71 fc             pushl  -0x4(%ecx)                  ; ??
 804839e:55                   push   %ebp                        ; Store the Base pointer
 804839f:89 e5                mov    %esp,%ebp                   ; Initialize the Base pointer with the stack pointer
 80483a1:51                   push   %ecx                        ; ??
 80483a2:83 ec 4c             sub    $0x4c,%esp                  ; ??
 80483a5:c7 45 f8 0c 00 00 00 movl   $0xc,-0x8(%ebp)             ; Move 12 into -0x8(%ebp)
 80483ac:c7 45 f4 14 00 00 00 movl   $0x14,-0xc(%ebp)            ; Move 20 into -0xc(%ebp)
 80483b3:8b 45 f8             mov    -0x8(%ebp),%eax             ; Move 12@-0x8(%ebp) into eax
 80483b6:83 c0 7b             add    $0x7b,%eax                  ; Add 123 to 12@eax
 80483b9:89 45 f4             mov    %eax,-0xc(%ebp)             ; Store the result into b@-0xc(%ebp)
 80483bc:b8 00 00 00 00       mov    $0x0,%eax                   ; Move 0 into eax
 80483c1:83 c4 10             add    $0x10,%esp                  ; ??
 80483c4:59                   pop    %ecx                        ; ??
 80483c5:5d                   pop    %ebp                        ; ??
 80483c6:8d 61 fc             lea    -0x4(%ecx),%esp             ; ??

Answer

The stack grows downward. A push subtracts from the stack pointer (esp) and a pop adds to esp. You have to keep that in mind to understand a lot of this.

8048394:8d 4c 24 04          lea    0x4(%esp),%ecx              ; ??

lea = Load Effective Address

This saves the address of the thing that lies 4 bytes into the stack. Since this is 32-bit x86 code (4-byte words), that means the second item on the stack. Since this is the code of a function (main in this case), the 4 bytes at the top of the stack are the return address, so ecx now points just past the return address, at main's first argument.

8048398:83 e4 f0             and    $0xfffffff0,%esp            ; ??

This code makes sure that the stack is aligned to 16 bytes. The and clears the low four bits of esp, so after this operation esp will be less than or equal to what it was before. The stack can therefore only grow, which protects anything that might already be on it. gcc does this in main in case main is entered with a stack that is not 16-byte aligned. By default gcc keeps the stack 16-byte aligned at calls on x86, because SSE instructions that require aligned memory operands (movaps, for example) fault on misaligned addresses, and that only works if the alignment is established somewhere. If main had an unaligned stack, the rest of the program would too.

 804839b:ff 71 fc             pushl  -0x4(%ecx)                  ; ??

ecx was loaded earlier with the address just above the return address, so -0x4(%ecx) refers back to the return address of the current function. This instruction pushes a copy of it onto the newly aligned stack, so that the return address sits directly above the saved ebp, as it would in an ordinary frame. The return itself uses the original, since the epilogue sets esp back to its value on entry. (push can take a memory operand, so a single instruction both loads from one place in memory and stores to another.)

 804839e:55                   push   %ebp                        ; Store the Base pointer
 804839f:89 e5                mov    %esp,%ebp                   ; Initialize the Base pointer with the stack pointer
 80483a1:51                   push   %ecx                        ; ??
 80483a2:83 ec 4c             sub    $0x4c,%esp                  ; ??

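This is mostly the standard function prologue (the previous stuff was special for main). It builds a stack frame (the area between ebp and esp) where local variables can live. ebp is pushed so that the old stack frame can be restored in the epilogue (at the end of the current function). push %ecx saves the pointer computed by the first lea, so that the epilogue can recover the original esp from it, and sub $0x4c,%esp reserves 0x4c (76) bytes for locals and scratch space.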

80483a5:c7 45 f8 0c 00 00 00 movl   $0xc,-0x8(%ebp)             ; Move 12 into -0x8(%ebp)
80483ac:c7 45 f4 14 00 00 00 movl   $0x14,-0xc(%ebp)            ; Move 20 into -0xc(%ebp)
80483b3:8b 45 f8             mov    -0x8(%ebp),%eax             ; Move 12@-0x8(%ebp) into eax
80483b6:83 c0 7b             add    $0x7b,%eax                  ; Add 123 to 12@eax
80483b9:89 45 f4             mov    %eax,-0xc(%ebp)             ; Store the result into b@-0xc(%ebp)

80483bc:b8 00 00 00 00       mov    $0x0,%eax                   ; Move 0 into eax

eax is where integer function return values are stored. This is setting up to return 0 from main.

80483c1:83 c4 10             add    $0x10,%esp                  ; ??
80483c4:59                   pop    %ecx                        ; ??
80483c5:5d                   pop    %ebp                        ; ??
80483c6:8d 61 fc             lea    -0x4(%ecx),%esp             ; ??

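This is the function epilogue. It is more difficult to understand because of the weird stack alignment code at the beginning. add $0x10,%esp releases the space reserved for locals, pop %ecx restores the pointer saved in the prologue, pop %ebp restores the caller's frame pointer, and lea -0x4(%ecx),%esp sets esp back to the value it had on entry to main, where the original return address lives. I'm having a little bit of trouble figuring out why the stack is being adjusted by a lower amount this time than in the prologue, though. For pop %ecx to pick up the value saved by push %ecx, the add would have to undo the sub exactly, so the 0x4c and the 0x10 in the listing do not fit together as shown.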

It is obvious that this particular code was not compiled with optimizations on. If it were, there probably wouldn't be much there, since the compiler can see that even if it did not do the math listed in your main, the end result of the program would be the same. With programs that actually do something (have side effects or results), it is sometimes easier to read lightly optimized code (-O1 or -Os arguments to gcc).

Reading assembly generated by a compiler is often much easier for functions that aren't main. If you want to read assembly in order to understand the code, then write yourself a function that takes some arguments to produce a result, or that works on global variables, and you will be able to understand it better.

Another thing that will probably help you is to just have gcc generate the assembly files for you, rather than disassembling the compiled output. The -S flag tells it to stop after generating assembly (no object file or executable is produced) and to write the result to a file whose name ends in .s. This should be easier for you to read than the disassembled versions.
