Waste in memory allocation for local variables

Views: 221 Published: 2016/7/18 20:49:26 Tags: c assembly x86 gdb


Problem description

This is my program:

void test_function(int a, int b, int c, int d){
    int flag;
    char buffer[10];

    flag = 31337;
    buffer[0] = 'A';
}

int main() {
    test_function(1, 2, 3, 4);
}

I compile this program with the debug option:

gcc -g my_program.c

I use gdb and disassemble test_function with Intel syntax:

(gdb) disassemble test_function
Dump of assembler code for function test_function:
0x08048344 <test_function+0>:   push   ebp
0x08048345 <test_function+1>:   mov    ebp,esp
0x08048347 <test_function+3>:   sub    esp,0x28
0x0804834a <test_function+6>:   mov    DWORD PTR [ebp-12],0x7a69
0x08048351 <test_function+13>:  mov    BYTE PTR [ebp-40],0x41
0x08048355 <test_function+17>:  leave  
0x08048356 <test_function+18>:  ret    
End of assembler dump.

Then I disassemble main:

(gdb) disassemble main
Dump of assembler code for function main:
0x08048357 <main+0>:    push   ebp
0x08048358 <main+1>:    mov    ebp,esp
0x0804835a <main+3>:    sub    esp,0x18
0x0804835d <main+6>:    and    esp,0xfffffff0
0x08048360 <main+9>:    mov    eax,0x0
0x08048365 <main+14>:   sub    esp,eax
0x08048367 <main+16>:   mov    DWORD PTR [esp+12],0x4
0x0804836f <main+24>:   mov    DWORD PTR [esp+8],0x3
0x08048377 <main+32>:   mov    DWORD PTR [esp+4],0x2
0x0804837f <main+40>:   mov    DWORD PTR [esp],0x1
0x08048386 <main+47>:   call   0x8048344 <test_function>
0x0804838b <main+52>:   leave  
0x0804838c <main+53>:   ret    
End of assembler dump.

I place a breakpoint at address 0x08048355 (the leave instruction of test_function) and run the program.

I examine the stack like this:

(gdb) x/16w $esp
0xbffff7d0:     0x00000041      0x08049548      0xbffff7e8      0x08048249
0xbffff7e0:     0xb7f9f729      0xb7fd6ff4      0xbffff818      0x00007a69
0xbffff7f0:     0xb7fd6ff4      0xbffff8ac      0xbffff818      0x0804838b
0xbffff800:     0x00000001      0x00000002      0x00000003      0x00000004

0x0804838b is the return address, 0xbffff818 is the saved frame pointer (main's ebp), and the flag variable is stored 12 bytes below that. Why 12?

I don't understand this instruction:

0x0804834a <test_function+6>:   mov    DWORD PTR [ebp-12],0x7a69

Why isn't the variable's value 0x00007a69 stored at ebp-4, where 0xbffff8ac sits instead?

Same question for buffer. Why 40?

Aren't we wasting memory? Are 0xb7fd6ff4 0xbffff8ac and 0xb7f9f729 0xb7fd6ff4 0xbffff818 0x08049548 0xbffff7e8 0x08048249 not used?

The output of the command gcc -Q -v -g my_program.c:

Reading specs from /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/specs
Configured with: ../src/configure -v --enable-languages=c,c++ --prefix=/usr --mandir=/usr/share/man --infodir=/usr/share/info --with-gxx-include-dir=/usr/include/c++/3.3 --enable-shared --enable-__cxa_atexit --with-system-zlib --enable-nls --without-included-gettext --enable-clocale=gnu --enable-debug i486-linux-gnu
Thread model: posix
gcc version 3.3.6 (Ubuntu 1:3.3.6-15ubuntu1)
 /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/cc1 -v -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=6 notesearch.c -dumpbase notesearch.c -auxbase notesearch -g -version -o /tmp/ccGT0kTf.s
GNU C version 3.3.6 (Ubuntu 1:3.3.6-15ubuntu1) (i486-linux-gnu)
        compiled by GNU C version 3.3.6 (Ubuntu 1:3.3.6-15ubuntu1).
GGC heuristics: --param ggc-min-expand=99 --param ggc-min-heapsize=129473
options passed:  -v -D__GNUC__=3 -D__GNUC_MINOR__=3 -D__GNUC_PATCHLEVEL__=6
 -auxbase -g
options enabled:  -fpeephole -ffunction-cse -fkeep-static-consts
 -fpcc-struct-return -fgcse-lm -fgcse-sm -fsched-interblock -fsched-spec
 -fbranch-count-reg -fcommon -fgnu-linker -fargument-alias
 -fzero-initialized-in-bss -fident -fmath-errno -ftrapping-math -m80387
 -mhard-float -mno-soft-float -mieee-fp -mfp-ret-in-387
 -maccumulate-outgoing-args -mcpu=pentiumpro -march=i486
ignoring nonexistent directory "/usr/local/include/i486-linux-gnu"
ignoring nonexistent directory "/usr/i486-linux-gnu/include"
ignoring nonexistent directory "/usr/include/i486-linux-gnu"
#include "..." search starts here:
#include <...> search starts here:
 /usr/local/include
 /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/include
 /usr/include
End of search list.
 gnu_dev_major gnu_dev_minor gnu_dev_makedev stat lstat fstat mknod fatal ec_malloc dump main print_notes find_user_note search_note
Execution times (seconds)
 preprocessing         :   0.00 ( 0%) usr   0.01 (25%) sys   0.00 ( 0%) wall
 lexical analysis      :   0.00 ( 0%) usr   0.01 (25%) sys   0.00 ( 0%) wall
 parser                :   0.02 (100%) usr   0.01 (25%) sys   0.00 ( 0%) wall
 TOTAL                 :   0.02             0.04             0.00
 as -V -Qy -o /tmp/ccugTYeu.o /tmp/ccGT0kTf.s
GNU assembler version 2.17.50 (i486-linux-gnu) using BFD version 2.17.50 20070103 Ubuntu
 /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/collect2 --eh-frame-hdr -m elf_i386 -dynamic-linker /lib/ld-linux.so.2 /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/../../../crt1.o /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/../../../crti.o /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/crtbegin.o -L/usr/lib/gcc-lib/i486-linux-gnu/3.3.6 -L/usr/lib/gcc-lib/i486-linux-gnu/3.3.6/../../.. /tmp/ccugTYeu.o -lgcc --as-needed -lgcc_s --no-as-needed -lc -lgcc --as-needed -lgcc_s --no-as-needed /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/crtend.o /usr/lib/gcc-lib/i486-linux-gnu/3.3.6/../../../crtn.o

NOTE: I am reading the book "The art of exploitation" and I use the VM provided with the book.

Recommended answer

The compiler is trying to maintain 16-byte alignment on the stack. This also applies to 32-bit code these days (not just 64-bit). The idea is that at the point just before a CALL instruction executes, the stack must be aligned to a 16-byte boundary.

Because you compiled without optimizations, there are some extraneous instructions.

0x0804835a <main+3>:    sub    esp,0x18        ; Allocate local stack space
0x0804835d <main+6>:    and    esp,0xfffffff0  ; Ensure `main` has a 16 byte aligned stack
0x08048360 <main+9>:    mov    eax,0x0         ; Extraneous, not needed
0x08048365 <main+14>:   sub    esp,eax         ; Extraneous, not needed

ESP is now 16-byte aligned after the last instruction above. We move the parameters for the call onto the stack, starting at the top of the stack at ESP. That is done with:

0x08048367 <main+16>:   mov    DWORD PTR [esp+12],0x4
0x0804836f <main+24>:   mov    DWORD PTR [esp+8],0x3
0x08048377 <main+32>:   mov    DWORD PTR [esp+4],0x2
0x0804837f <main+40>:   mov    DWORD PTR [esp],0x1
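The effect of main's prologue on ESP can be sketched in C. This is only an illustration of the arithmetic, not compiler output: the function names are hypothetical, and the starting value 0xbffff818 is main's EBP as it appears (as the saved frame pointer) in the stack dump from the question.

```c
#include <stdint.h>

/* Round a stack pointer down to the nearest 16-byte boundary,
   which is what `and esp,0xfffffff0` does. */
uint32_t align_down_16(uint32_t esp) {
    return esp & 0xfffffff0u;
}

/* Hypothetical helper: ESP after main's prologue, given main's EBP. */
uint32_t main_esp_after_prologue(uint32_t ebp) {
    uint32_t esp = ebp - 0x18u;   /* sub esp,0x18 */
    return align_down_16(esp);    /* and esp,0xfffffff0 */
}
```

Calling main_esp_after_prologue(0xbffff818u) gives 0xbffff800, which is the address where the stack dump shows the first argument (0x00000001).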

The CALL then pushes a 4-byte return address on the stack. After the call we reach these instructions:

0x08048344 <test_function+0>:   push   ebp     ; 4 bytes pushed on stack
0x08048345 <test_function+1>:   mov    ebp,esp ; Setup stackframe

This pushes another 4 bytes on the stack. Together with the 4 bytes from the return address, we are now misaligned by 8 bytes. To reach 16-byte alignment again we need to waste an additional 8 bytes on the stack. That is why this statement allocates an additional 8 bytes:

0x08048347 <test_function+3>:   sub    esp,0x28

0x08 bytes already on the stack because of the return address (4 bytes) and EBP (4 bytes)

0x08 bytes of padding needed to bring the stack back to 16-byte alignment

0x20 bytes needed for local variable allocation = 32 bytes. 32 is evenly divisible by 16, so alignment is maintained

The second and third numbers above, added together, give the value 0x28 computed by the compiler and used in sub esp,0x28.
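The bookkeeping above can be replayed as a small C sketch. The function name is hypothetical, and the input 0xbffff800 is assumed to be ESP in main just before the call, since that is where the stack dump in the question shows the first argument.

```c
#include <stdint.h>

/* ESP at the `leave` breakpoint in test_function, starting from the
   value ESP had in main just before `call`. */
uint32_t esp_in_test_function(uint32_t esp_before_call) {
    uint32_t esp = esp_before_call;
    esp -= 4;      /* call: pushes the 4-byte return address */
    esp -= 4;      /* push ebp: saves main's frame pointer */
    esp -= 0x28;   /* sub esp,0x28: 8 bytes padding + 0x20 bytes locals */
    return esp;
}
```

With esp_before_call = 0xbffff800 the result is 0xbffff7d0, the first address printed by x/16w $esp in the question, and it is a multiple of 16.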

0x0804834a <test_function+6>:   mov    DWORD PTR [ebp-12],0x7a69

So why [ebp-12] in this instruction? The first 8 bytes, [ebp-8] through [ebp-1], are the alignment bytes used to get the stack 16-byte aligned. The local data then appears on the stack after that. In this case [ebp-12] through [ebp-9] are the 4 bytes of the 32-bit integer flag.
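To check these offsets against the stack dump in the question, here is a minimal C sketch. The array is copied from the x/16w $esp output; the names dump, dump_base, frame_ebp and word_at_ebp are hypothetical, and frame_ebp = 0xbffff7f8 is inferred as the address where the dump shows the saved frame pointer 0xbffff818.

```c
#include <stdint.h>

/* The 16 words printed by `x/16w $esp`, copied from the question. */
static const uint32_t dump[16] = {
    0x00000041u, 0x08049548u, 0xbffff7e8u, 0x08048249u,
    0xb7f9f729u, 0xb7fd6ff4u, 0xbffff818u, 0x00007a69u,
    0xb7fd6ff4u, 0xbffff8acu, 0xbffff818u, 0x0804838bu,
    0x00000001u, 0x00000002u, 0x00000003u, 0x00000004u
};
static const uint32_t dump_base = 0xbffff7d0u;  /* address of dump[0] */
static const uint32_t frame_ebp = 0xbffff7f8u;  /* EBP inside test_function */

/* Return the 32-bit word stored at [ebp + offset].
   The offset must be a multiple of 4 and fall inside the dump. */
uint32_t word_at_ebp(int32_t offset) {
    uint32_t addr = frame_ebp + (uint32_t)offset;
    return dump[(addr - dump_base) / 4u];
}
```

Here word_at_ebp(-12) is 0x00007a69 (flag), word_at_ebp(0) is the saved frame pointer, word_at_ebp(4) is the return address, and word_at_ebp(-40) is 0x00000041, the word written by the [ebp-40] store in the question's disassembly.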

Then we have this for updating buffer[0] with the character 'A':

0x08048351 <test_function+13>:  mov    BYTE PTR [ebp-40],0x41

The oddity is why a 10-byte array of characters would occupy [ebp-40] (the beginning of the array) through [ebp-13], which is 28 bytes. My best guess is that the compiler felt it could treat the 10-byte character array as a 128-bit (16-byte) vector. This would force the compiler to align the buffer on a 16-byte boundary and pad the array out to 16 bytes (128 bits). From the compiler's perspective, your code seems to act much as if it had been defined as:

#include <xmmintrin.h>
void test_function(int a, int b, int c, int d){
    int flag;
    union {
        char buffer[10];
        __m128 m128buffer;      /* 16-byte variable that needs to be 16-byte aligned */
    } bufu;

    flag = 31337;
    bufu.buffer[0] = 'A';
}
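The size and alignment such a union would have can be checked with a portable C11 sketch. This is an assumption-laden stand-in, not the code from the answer: the member declared with _Alignas(16) plays the role of the __m128 member so the example does not depend on xmmintrin.h, and the names bufu16, bufu16_size and bufu16_align are hypothetical.

```c
#include <stddef.h>

/* Stand-in for the union in the answer: `_Alignas(16) char vec[16]`
   takes the place of the __m128 member (an assumption for portability). */
union bufu16 {
    char buffer[10];
    _Alignas(16) char vec[16];
};

/* Size in bytes of the union, padding included. */
size_t bufu16_size(void)  { return sizeof(union bufu16); }

/* Alignment in bytes the compiler requires for the union. */
size_t bufu16_align(void) { return _Alignof(union bufu16); }
```

Both functions return 16: the 10-byte array is padded out to 16 bytes and must start on a 16-byte boundary. That fits the guess above, although it does not prove that GCC 3.3.6 reasoned this way.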

The output on GodBolt for GCC 4.9.0 generating 32-bit code with SSE2 enabled appears as follows:

test_function:
        push    ebp     #
        mov     ebp, esp  #, 
        sub     esp, 40   #,same as: sub esp,0x28
        mov     DWORD PTR [ebp-12], 31337 # flag,
        mov     BYTE PTR [ebp-40], 65     # bufu.buffer,
        leave
        ret

This looks very similar to your disassembly in GDB.

If you compiled with optimizations (such as -O1, -O2, -O3), the optimizer could have simplified test_function because it is a leaf function in your example. A leaf function is one that doesn't call another function. Certain shortcuts could have been applied by the compiler.

As for why the character array seems to be aligned to a 16-byte boundary and padded to 16 bytes: that probably can't be answered with certainty until we know which GCC version you are using (gcc --version will tell you). It would also be useful to know your OS and OS version. Even better would be to add the output of this command to your question: gcc -Q -v -g my_program.c
