An intended buffer overflow that does not always cause the program to crash


Question

Consider the following minimal C program:

Case 1:

#include <stdio.h>
#include <string.h>

void foo(char* s)
{
    char buffer[10];
    strcpy(buffer,s);
}

int main(void)
{
    foo("01234567890134567");
}

This does not cause a crash dump.

If I add just one more character, so that the new main is:

Case 2:

int main(void)
{
    foo("012345678901345678");
    /*                    ^ the added character */
}

The program crashes with a segmentation fault.

It looks like, in addition to the 10 characters reserved on the stack, there is room for 8 more characters. Thus the first program doesn't crash. However, if you add one more character you start accessing invalid memory. My questions are:

  1. Why do we have these 8 extra characters reserved on the stack?
  2. Is this somehow related to the alignment of the char data type in memory?
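
As a sanity check on the arithmetic behind these questions, the sketch below counts how many bytes strcpy writes for each string, including the terminating null byte, and how many of them fall past a 10-byte buffer. The helper names copy_size and overflow_bytes are hypothetical and not part of the original program.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Bytes strcpy would write for s: the characters plus the terminating '\0'. */
size_t copy_size(const char *s)
{
    assert(s != NULL);
    return strlen(s) + 1;
}

/* Bytes that land past the end of a destination of `capacity` bytes. */
size_t overflow_bytes(const char *s, size_t capacity)
{
    size_t n = copy_size(s);
    return n > capacity ? n - capacity : 0;
}
```

For the case 1 string, copy_size gives 18 and overflow_bytes(s, 10) gives 8; for the case 2 string they give 19 and 9. So the crash appears once a ninth byte is written past the end of the buffer.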

Another doubt I have is how the OS (Windows in this case) detects the bad memory access. According to the Windows documentation, the default stack size is 1 MB. So I don't see how the OS detects that the address being accessed is outside the process memory, especially when the minimum page size is normally 4 KB. Does the OS use the SP in this case to check the address?

PS: I'm using the following environment for testing:
Cygwin
GCC 4.8.3
Windows 7 OS

Edit:

This is the assembly generated by http://gcc.godbolt.org/# using GCC 4.8.2, since GCC 4.8.3 is not among the available compilers. I guess the generated code should be similar. I built the code without any flags. I hope somebody with assembly expertise can shed some light on what is happening in the foo function and why the extra char causes the segmentation fault.

foo(char*):
    pushq   %rbp
    movq    %rsp, %rbp
    subq    $48, %rsp               # reserve 48 bytes for locals
    movq    %rdi, -40(%rbp)         # spill the argument s
    movq    %fs:40, %rax            # load the stack-protector canary
    movq    %rax, -8(%rbp)          # store it just below the saved %rbp
    xorl    %eax, %eax
    movq    -40(%rbp), %rdx
    leaq    -32(%rbp), %rax         # address of buffer
    movq    %rdx, %rsi
    movq    %rax, %rdi
    call    strcpy
    movq    -8(%rbp), %rax
    xorq    %fs:40, %rax            # has the canary changed?
    je  .L2
    call    __stack_chk_fail        # yes: abort
.L2:
    leave
    ret
.LC0:
    .string "01234567890134567"
main:
    pushq   %rbp
    movq    %rsp, %rbp
    movl    $.LC0, %edi
    call    foo(char*)
    movl    $0, %eax
    popq    %rbp
    ret
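
The stack offsets in this listing can be turned into distances from the start of buffer. The sketch below only illustrates that arithmetic. The helper names are hypothetical, and the displacements (-32 for buffer, -8 for the stack-protector canary, 0 for the saved %rbp, 8 for the return address) are read off this particular listing. The %fs:40 canary suggests it comes from a Linux build of GCC with stack protection enabled, so it may not match the frame produced by the Cygwin build.

```c
#include <assert.h>
#include <stddef.h>

/* Distance in bytes from the start of the buffer to another stack slot,
   both given as displacements from %rbp, as they appear in the listing. */
long bytes_until(long buffer_disp, long slot_disp)
{
    assert(slot_disp >= buffer_disp);   /* the slot must sit above the buffer */
    return slot_disp - buffer_disp;
}

/* Nonzero if a copy of copy_size bytes starting at the buffer
   writes at least one byte into the given slot. */
int copy_reaches(size_t copy_size, long buffer_disp, long slot_disp)
{
    return (long)copy_size > bytes_until(buffer_disp, slot_disp);
}
```

With these numbers, bytes_until(-32, -8) is 24, so in this particular build an 18- or 19-byte copy would stay in the padding below the canary, and only a copy of 25 bytes or more would disturb it.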

Recommended answer

The official, system-agnostic answer is:

Your code writes data beyond the end of the destination array, so the behaviour is undefined. Anything can happen, including nothing at all or a space probe crashing on the surface of Mars. Observing no noticeable effect up to 8 bytes beyond the end of the buffer, and a crash with a segmentation fault beyond that, are both possible effects of undefined behaviour, well within the expected outcome.

The extra implementation details you are interested in:

The actual behaviour will depend on many circumstances, for example which compiler you use, which OS, which ABI (Application Binary Interface), and so on.

Your program is compiled and executed in a 64-bit Windows environment. In this environment, the stack is kept aligned on 64-bit boundaries, and in practice on 16-byte boundaries, which allows aligned loads and stores of SSE (XMM) registers from and to stack locations. The array buffer[10] may therefore occupy 16 bytes on the stack. Given how the stack is laid out on this processor, the array will be located just below the locations used by function foo to store any saved registers and the return address into the caller function main. Whether the extra 6 bytes are before or after the array is a choice for the compiler to make. It could use this space for other local variables or just ignore it.
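
The 16-byte figure is the 10-byte array rounded up to the next multiple of the alignment. A minimal sketch of that rounding, assuming a 16-byte alignment as described above (the helper names are hypothetical, and a real compiler is free to lay out the frame differently):

```c
#include <assert.h>
#include <stddef.h>

/* Round size up to the next multiple of alignment. */
size_t round_up(size_t size, size_t alignment)
{
    assert(alignment != 0);
    return (size + alignment - 1) / alignment * alignment;
}

/* Unused bytes left over once size has been rounded up. */
size_t slack_bytes(size_t size, size_t alignment)
{
    return round_up(size, alignment) - size;
}
```

Here round_up(10, 16) is 16 and slack_bytes(10, 16) is 6, which is where the 6 extra bytes mentioned above come from.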

Writing beyond the end of buffer may be harmless for up to 6 bytes if the padding is after the array. It might not have any noticeable effect for another 8 bytes (clobbering the saved rbp register, which is not used in main after the call). Beyond that it will start having bad side effects, because you will be overwriting the return address.
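
The regions described above can be written down as a small classifier. This is a sketch of the layout hypothesised in this answer only (10 bytes of buffer, 6 bytes of padding after it, 8 bytes of saved rbp, 8 bytes of return address); the enum and function names are hypothetical, and the real frame is whatever the compiler generated.

```c
#include <assert.h>
#include <stddef.h>

/* Regions of the frame in the assumed layout, by offset from buffer[0]. */
enum region { IN_BUFFER, IN_PADDING, IN_SAVED_RBP, IN_RETURN_ADDRESS, BEYOND_FRAME };

/* Region holding the last byte written by a copy of n bytes into buffer. */
enum region last_region_touched(size_t n)
{
    assert(n > 0);
    size_t last = n - 1;                      /* offset of the final byte */
    if (last < 10) return IN_BUFFER;          /* offsets 0..9   */
    if (last < 16) return IN_PADDING;         /* offsets 10..15 */
    if (last < 24) return IN_SAVED_RBP;       /* offsets 16..23 */
    if (last < 32) return IN_RETURN_ADDRESS;  /* offsets 24..31 */
    return BEYOND_FRAME;
}
```

Under this assumed layout a copy must write at least 25 bytes before it touches the return address. The 18- and 19-byte copies from the question both end in the saved rbp slot, so the frame actually produced by the Cygwin build may well be laid out differently.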

When you overwrite the return address, the processor will not return from function foo to the caller main, but to whatever address is stored on the stack after being corrupted by the offending code. If this corrupted address points to executable code, that code will be executed, with potentially harmful consequences. Hackers do exactly this: they carefully craft an exploit that manages to store some harmful code at a known location in executable memory, and take advantage of the buffer overflow to store the address of that code in the stack location of the return address.

In your case, the location pointed to by the corrupted return address might not be executable, triggering the segmentation fault you observe.
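
To see why a corrupted return address is unlikely to be usable, the sketch below shows what an 8-byte return address would look like if it were entirely overwritten by ASCII digits from the string. This is an illustration under assumptions, not a reconstruction of the actual crash: it assumes a little-endian x86-64 machine with 48-bit virtual addresses and a full 8-byte overwrite (the 19-byte copy in the question may overwrite fewer bytes), and the helper names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assemble 8 bytes into the 64-bit value a little-endian CPU would load. */
uint64_t as_return_address(const char *bytes)
{
    assert(bytes != NULL);
    uint64_t value = 0;
    for (int i = 0; i < 8; i++)
        value |= (uint64_t)(unsigned char)bytes[i] << (8 * i);
    return value;
}

/* With 48-bit virtual addresses, bits 63..47 must be all zeros or all ones. */
int is_canonical_48(uint64_t addr)
{
    uint64_t top = addr >> 47;   /* the upper 17 bits */
    return top == 0 || top == 0x1FFFF;
}
```

Under these assumptions, as_return_address("01234567") yields 0x3736353433323130, which is_canonical_48 rejects, so returning to it would fault rather than execute code.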

I suggest you try compiling your code on this site to see the actual assembly code generated under various compiler options: http://gcc.godbolt.org/#
