Tiny C Compiler's generated code emits extra (unnecessary?) NOPs and JMPs

Question

Can someone explain why this code:

#include <stdio.h>

int main()
{
  return 0;
}

when compiled with tcc using tcc code.c produces this asm:

00401000  |.  55               PUSH EBP
00401001  |.  89E5             MOV EBP,ESP
00401003  |.  81EC 00000000    SUB ESP,0
00401009  |.  90               NOP
0040100A  |.  B8 00000000      MOV EAX,0
0040100F  |.  E9 00000000      JMP fmt_vuln1.00401014
00401014  |.  C9               LEAVE
00401015  |.  C3               RETN

I guess that

00401009  |.  90   NOP

is maybe there for some memory alignment, but what about

0040100F  |.  E9 00000000     JMP fmt_vuln1.00401014
00401014  |.  C9              LEAVE

I mean, why would the compiler insert this near jump to the next instruction? LEAVE would execute anyway.

I'm on 64-bit Windows generating 32-bit executable using TCC 0.9.26.

Recommended answer

Superfluous JMP before the Function Epilogue

The JMP at the bottom that goes to the next instruction was fixed in a commit. Version 0.9.27 of TCC resolves this issue:

When 'return' is the last statement of the top-level block (very common and often recommended case) jump is not needed.

Why did it exist in the first place? The idea is that each function has a single common exit point. When a return appears inside a nested block, the JMP goes to that common exit point, where stack cleanup is done and the ret is executed. Originally, the code generator also emitted the JMP when the return appeared just before the function's final } (closing brace), where it is redundant. The fix checks whether a return statement is followed by the closing brace at the top level of the function. If so, the JMP is omitted.
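
The behavior before and after the fix can be modeled with a small sketch. This is not TCC's actual code: the Stmt type, the count_exit_jumps function, and the skip_final_jump flag are hypothetical names invented for illustration. The model assumes every return becomes a JMP to the shared epilogue, and that the fixed generator skips that JMP only when the return is the last statement of the top-level block.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified view of one statement in a function body. */
typedef struct {
    int is_return;  /* nonzero if the statement is a `return` */
    int top_level;  /* nonzero if it sits directly in the function's outermost block */
} Stmt;

/*
 * Count the JMP-to-epilogue instructions a single-pass generator would emit.
 * skip_final_jump == 0 models the old behavior (every return jumps);
 * skip_final_jump != 0 models the fix (a return that is the last top-level
 * statement simply falls through into the epilogue).
 */
static int count_exit_jumps(const Stmt *body, size_t n, int skip_final_jump)
{
    int jumps = 0;

    assert(body != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        if (!body[i].is_return)
            continue;
        if (skip_final_jump && i == n - 1 && body[i].top_level)
            continue;  /* the epilogue is the very next instruction */
        jumps++;
    }
    return jumps;
}
```

Under this model, a body consisting only of a top-level return (the question's main) yields one JMP with the old behavior and none with the fix, which matches the listing in the question.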

An example of code that has a return at a lower scope before a closing brace:

int main(int argc, char *argv[])
{
  if (argc == 3) {
      argc++;
      return argc;
  }
  argc += 3;
  return argc;
}

The generated code looks like this:

  401000:       55                      push   ebp
  401001:       89 e5                   mov    ebp,esp
  401003:       81 ec 00 00 00 00       sub    esp,0x0
  401009:       90                      nop
  40100a:       8b 45 08                mov    eax,DWORD PTR [ebp+0x8]
  40100d:       83 f8 03                cmp    eax,0x3
  401010:       0f 85 11 00 00 00       jne    0x401027
  401016:       8b 45 08                mov    eax,DWORD PTR [ebp+0x8]
  401019:       89 c1                   mov    ecx,eax
  40101b:       40                      inc    eax
  40101c:       89 45 08                mov    DWORD PTR [ebp+0x8],eax
  40101f:       8b 45 08                mov    eax,DWORD PTR [ebp+0x8]

  ; Jump to common function exit point. This is the `return argc` inside the if statement
  401022:       e9 11 00 00 00          jmp    0x401038

  401027:       8b 45 08                mov    eax,DWORD PTR [ebp+0x8]
  40102a:       83 c0 03                add    eax,0x3
  40102d:       89 45 08                mov    DWORD PTR [ebp+0x8],eax
  401030:       8b 45 08                mov    eax,DWORD PTR [ebp+0x8]

  ; Jump to common function exit point. This is the `return argc` at end of the function 
  401033:       e9 00 00 00 00          jmp    0x401038

  ; Common function exit point
  401038:       c9                      leave
  401039:       c3                      ret

In versions prior to 0.9.27, the return argc inside the if statement jumps to a common exit point (the function epilogue). The return argc at the bottom of the function also jumps to that same common exit point. The problem is that the common exit point happens to be right after the top-level return argc, so the side effect is an extra JMP whose target is simply the next instruction.

The NOP isn't for alignment. Because of the way Windows implements guard pages for the stack (for programs in the Portable Executable format), TCC has two types of prologue. If the local stack space required is < 4096 bytes (smaller than a single page), you see this kind of code generated:

401000:       55                      push   ebp
401001:       89 e5                   mov    ebp,esp
401003:       81 ec 00 00 00 00       sub    esp,0x0

The sub esp,0 isn't optimized out. Its operand is the amount of stack space needed for local variables (in this case 0). If you add some local variables, the 0x0 in the SUB instruction changes to match the stack space they need. This prologue requires 9 bytes. There is another prologue to handle the case where the stack space needed is >= 4096 bytes. If you add an array of 4096 bytes with something like:

char somearray[4096];

and look at the resulting instructions, you will see the function prologue change to a 10-byte prologue:

401000:       b8 00 10 00 00          mov    eax,0x1000
401005:       e8 d6 00 00 00          call   0x4010e0

TCC's code generator assumes that the function prologue is always 10 bytes when targeting WinPE. This is primarily because TCC is a single-pass compiler: it doesn't know how much stack space a function will use until after the function has been processed. To work around this, TCC reserves 10 bytes for the prologue up front, enough to fit the largest variant. Anything shorter is padded to 10 bytes.

When the stack space needed is < 4096 bytes, the instructions used total 9 bytes, and the NOP pads the prologue to 10 bytes. When >= 4096 bytes are needed, the number of bytes is passed in EAX and the function __chkstk is called to allocate the required stack space instead.
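
The two prologue shapes can be sketched as a routine that fills a reserved 10-byte slot once the frame size is known. This is a minimal illustration and not TCC's source: fill_prologue, put_le32, PROLOGUE_SIZE, and the chkstk_rel32 parameter (an already-computed CALL displacement) are assumptions introduced here. The opcode bytes are the ones visible in the listings above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define PROLOGUE_SIZE 10    /* bytes reserved for every prologue (hypothetical name) */
#define GUARD_PAGE    4096  /* one stack page */

/* Store a 32-bit value in little-endian order, as x86 immediates require. */
static void put_le32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v & 0xFF);
    p[1] = (uint8_t)((v >> 8) & 0xFF);
    p[2] = (uint8_t)((v >> 16) & 0xFF);
    p[3] = (uint8_t)((v >> 24) & 0xFF);
}

/*
 * Fill the reserved slot after the function body has been processed.
 * chkstk_rel32 is assumed to be the precomputed relative displacement
 * of the call to __chkstk; computing it is outside this sketch.
 */
static void fill_prologue(uint8_t *out, uint32_t frame_size, uint32_t chkstk_rel32)
{
    assert(out != NULL);

    if (frame_size < GUARD_PAGE) {
        out[0] = 0x55;                 /* push ebp        */
        out[1] = 0x89; out[2] = 0xE5;  /* mov  ebp, esp   */
        out[3] = 0x81; out[4] = 0xEC;  /* sub  esp, imm32 */
        put_le32(out + 5, frame_size);
        out[9] = 0x90;                 /* nop: pads 9 bytes up to 10 */
    } else {
        out[0] = 0xB8;                 /* mov  eax, imm32 */
        put_le32(out + 1, frame_size);
        out[5] = 0xE8;                 /* call rel32 (to __chkstk) */
        put_le32(out + 6, chkstk_rel32);
    }
}
```

Because both branches write exactly 10 bytes, the code that follows the prologue never has to move, which is the property a single-pass generator needs.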
