Why does GCC pad functions with NOPs?
Question
I've been working with C for a short while and only recently started getting into assembly. When I compile this program:
int main(void)
{
    int a = 0;
    a += 1;
    return 0;
}
The objdump disassembly shows the code, followed by nops after the ret:
...
08048394 <main>:
8048394: 55 push %ebp
8048395: 89 e5 mov %esp,%ebp
8048397: 83 ec 10 sub $0x10,%esp
804839a: c7 45 fc 00 00 00 00 movl $0x0,-0x4(%ebp)
80483a1: 83 45 fc 01 addl $0x1,-0x4(%ebp)
80483a5: b8 00 00 00 00 mov $0x0,%eax
80483aa: c9 leave
80483ab: c3 ret
80483ac: 90 nop
80483ad: 90 nop
80483ae: 90 nop
80483af: 90 nop
...
From what I've learned, nops do nothing, and since these come after the ret they would never even be executed.
My question is: why bother? Couldn't ELF (linux-x86) work with a .text section (+main) of any size?
I'd appreciate any help, just trying to learn.
Recommended answer
First of all, gcc doesn't always do this. The padding is controlled by -falign-functions, which is automatically turned on by -O2 and -O3:
-falign-functions
-falign-functions=n
Align the start of functions to the next power-of-two greater than n, skipping up to n bytes. For instance, -falign-functions=32 aligns functions to the next 32-byte boundary, but -falign-functions=24 would align to the next 32-byte boundary only if this can be done by skipping 23 bytes or less.
-fno-align-functions and -falign-functions=1 are equivalent and mean that functions will not be aligned.
Some assemblers only support this flag when n is a power of two; in that case, it is rounded up.
If n is not specified or is zero, use a machine-dependent default.
Enabled at levels -O2, -O3.
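The rule quoted above can be sketched in a few lines of C. This is only an illustration of the documented behaviour, not GCC's actual implementation: the function names round_up_pow2 and align_padding are made up for this example, and it assumes "next power-of-two greater than n" means the smallest power of two that is at least n (which is what the -falign-functions=32 example implies).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest power of two that is >= n (n >= 1). */
uint64_t round_up_pow2(uint64_t n)
{
    assert(n >= 1);
    uint64_t p = 1;
    while (p < n)
        p <<= 1;
    return p;
}

/*
 * Hypothetical helper: number of padding bytes placed before a function
 * that would otherwise start at `addr`, following the quoted rule for
 * -falign-functions=n. Returns 0 (no alignment) when reaching the next
 * boundary would require skipping n bytes or more.
 */
uint64_t align_padding(uint64_t addr, uint64_t n)
{
    uint64_t align = round_up_pow2(n);
    uint64_t skip = (align - addr % align) % align; /* distance to boundary */
    return skip < n ? skip : 0;
}
```

For example, align_padding(0xb, 16) is 5: a function that would start at 0xb is pushed to 0x10. With n = 24, a function at 0x09 gets 23 bytes of padding, while one at 0x08 would need 24 and is left where it is.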
There could be multiple reasons for doing this, but the main one on x86 is probably this:
Most processors fetch instructions in aligned 16-byte or 32-byte blocks. It can be advantageous to align critical loop entries and subroutine entries by 16 in order to minimize the number of 16-byte boundaries in the code. Alternatively, make sure that there is no 16-byte boundary in the first few instructions after a critical loop entry or subroutine entry.
(Quoted from "Optimizing subroutines in assembly language" by Agner Fog.)
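To make the fetch-block argument concrete, here is a rough C sketch that counts how many aligned blocks a run of code bytes touches. The function name fetch_blocks_touched is hypothetical, and the model is deliberately simplified: it treats instruction fetch as reading whole aligned blocks and ignores everything else a real front end does.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of aligned `block`-byte chunks touched by
 * `len` bytes of code starting at address `start`.
 * `block` must be a power of two and `len` must be non-zero.
 */
uint64_t fetch_blocks_touched(uint64_t start, uint64_t len, uint64_t block)
{
    assert(len > 0);
    assert(block != 0 && (block & (block - 1)) == 0);
    uint64_t first = start / block;             /* block holding the first byte */
    uint64_t last = (start + len - 1) / block;  /* block holding the last byte */
    return last - first + 1;
}
```

Under this model, an 11-byte function starting at 0xb straddles a 16-byte boundary and touches two blocks, while the same function starting at 0x10 fits in one. Those are the two placements of g in the align.c example in this answer.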
Edit: Here is an example that demonstrates the padding:
// align.c
int f(void) { return 0; }
int g(void) { return 0; }
When compiled using gcc 4.4.5 with default settings, I get:
align.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <f>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: b8 00 00 00 00 mov $0x0,%eax
9: c9 leaveq
a: c3 retq
000000000000000b <g>:
b: 55 push %rbp
c: 48 89 e5 mov %rsp,%rbp
f: b8 00 00 00 00 mov $0x0,%eax
14: c9 leaveq
15: c3 retq
Specifying -falign-functions gives:
align.o: file format elf64-x86-64
Disassembly of section .text:
0000000000000000 <f>:
0: 55 push %rbp
1: 48 89 e5 mov %rsp,%rbp
4: b8 00 00 00 00 mov $0x0,%eax
9: c9 leaveq
a: c3 retq
b: eb 03 jmp 10 <g>
d: 90 nop
e: 90 nop
f: 90 nop
0000000000000010 <g>:
10: 55 push %rbp
11: 48 89 e5 mov %rsp,%rbp
14: b8 00 00 00 00 mov $0x0,%eax
19: c9 leaveq
1a: c3 retq