Find out where heap memory gets corrupted

Problem description

I know there are already many similar questions and answers, but I have not been able to solve my problem.

In my large application the heap is getting corrupted somewhere, and I am not able to locate where. I have also used tools like gflags, but with no luck.

I tried gflags on the following sample, which corrupts the heap on purpose:

char* pBuffer = new char[256];
memset(pBuffer, 0, 256 + 1);
delete[] pBuffer;

At line 2 the heap is overwritten, but how can I find that with tools like gflags, WinDbg, etc.? Maybe I am not using gflags properly.

Recommended answer

If automated tools (like Electric Fence or Valgrind) don't do the trick, staring intently at your code doesn't reveal where it went wrong, and disabling/enabling various operations (until you find a correlation between the presence of heap corruption and which operations did or didn't execute beforehand) doesn't narrow it down, you can always try this technique. It attempts to catch the corruption sooner rather than later, so that the source is easier to track down:

Create your own custom new and delete operators that put corruption-evident guard areas around the allocated memory regions, something like this:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <new>

// Make this however big you feel is "big enough" so that corrupted bytes will be seen in the guard bands
static const int GUARD_BAND_SIZE_BYTES = 64;

static void * MyCustomAlloc(size_t userNumBytes)
{
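    // We'll allocate space for a guard-band, then space to store the user's allocation-size-value,
    // then space for the user's actual data bytes, then finally space for a second guard-band at the end.
    // Note: the returned pointer sits GUARD_BAND_SIZE_BYTES+sizeof(size_t) bytes past malloc()'s result
    // (72 bytes on a typical 64-bit build), so it is 8-byte aligned there, not 16-byte aligned.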
    char * buf = (char *) malloc(GUARD_BAND_SIZE_BYTES+sizeof(userNumBytes)+userNumBytes+GUARD_BAND_SIZE_BYTES);
    if (buf)
    {
       char * w = buf;
       memset(w, 'B', GUARD_BAND_SIZE_BYTES);          w += GUARD_BAND_SIZE_BYTES;
       memcpy(w, &userNumBytes, sizeof(userNumBytes)); w += sizeof(userNumBytes);
       char * userRetVal = w;                          w += userNumBytes;
       memset(w, 'E', GUARD_BAND_SIZE_BYTES);          w += GUARD_BAND_SIZE_BYTES;
       return userRetVal;
    }
    else throw std::bad_alloc();
}

static void MyCustomDelete(void * p)
{
    if (p == NULL) return;   // since delete NULL is a safe no-op

    // Convert the user's pointer back to a pointer to the top of our header bytes
    char * internalCP = ((char *) p)-(GUARD_BAND_SIZE_BYTES+sizeof(size_t));

    char * cp = internalCP;
    for (int i=0; i<GUARD_BAND_SIZE_BYTES; i++)
    {
        if (*cp++ != 'B')
        {
            printf("CORRUPTION DETECTED at BEGIN GUARD BAND POSITION %i of allocation %p\n", i, p);
            abort();
        }
    }

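    // At this point, (cp) should be pointing to the stored (userNumBytes) field
    size_t userNumBytes;
    memcpy(&userNumBytes, cp, sizeof(userNumBytes));  // mirrors the memcpy() in MyCustomAlloc()
    cp += sizeof(userNumBytes);  // skip past the stored size field
    cp += userNumBytes;          // skip past the user's data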

    // At this point, (cp) should be pointing to the second guard band
    for (int i=0; i<GUARD_BAND_SIZE_BYTES; i++)
    {
        if (*cp++ != 'E')
        {
            printf("CORRUPTION DETECTED at END GUARD BAND POSITION %i of allocation %p\n", i, p);
            abort();
        }
    }

    // If we got here, no corruption was detected, so free the memory and carry on
    free(internalCP);
}

// Override the global C++ new/delete operators to call our
// instrumented functions rather than their normal behavior.
// (Dynamic exception specifications such as throw(std::bad_alloc) were
// removed in C++17, so noexcept is used on the delete operators instead.)
void * operator new(size_t s)             {return MyCustomAlloc(s);}
void * operator new[](size_t s)           {return MyCustomAlloc(s);}
void operator delete(void * p)   noexcept {MyCustomDelete(p);}
void operator delete[](void * p) noexcept {MyCustomDelete(p);}

The above is enough to get you Electric Fence-style functionality: if anything writes into either of the two 64-byte "guard bands" at the beginning or end of any allocation made with new, then when that allocation is deleted, MyCustomDelete() will notice the corruption and abort the program.
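To make the end-guard-band report concrete, here is a minimal sketch that replays the question's off-by-one memset() against the same layout ([64 x 'B'][256 user bytes][64 x 'E']), built inside an ordinary std::vector so that no real heap metadata is touched. The helper names FirstDamagedByte() and SimulateQuestionOverflow() are made up for this illustration and are not part of the answer's code.

```cpp
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical helper: index of the first byte in a guard band that no longer
// holds the fill value, or -1 if the band is intact.
static long FirstDamagedByte(const unsigned char* band, std::size_t bandBytes,
                             unsigned char fill)
{
    for (std::size_t i = 0; i < bandBytes; ++i)
        if (band[i] != fill) return static_cast<long>(i);
    return -1;
}

// Hypothetical demo: lays out guard / user data / guard in one buffer,
// performs the question's overflow, and reports where the end guard was hit.
static long SimulateQuestionOverflow()
{
    const std::size_t guard = 64, user = 256;
    std::vector<unsigned char> block(guard + user + guard);
    std::memset(block.data(), 'B', guard);                 // leading guard band
    std::memset(block.data() + guard + user, 'E', guard);  // trailing guard band

    unsigned char* pBuffer = block.data() + guard;         // what "new char[256]" would return
    std::memset(pBuffer, 0, user + 1);                     // same off-by-one as the question

    return FirstDamagedByte(block.data() + guard + user, guard, 'E');
}
```

The extra byte written by memset() lands on the first byte of the trailing band, so SimulateQuestionOverflow() returns 0. With the real operators above, the corresponding report would be the END GUARD BAND message at position 0, printed when delete[] runs.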

If that's not good enough (e.g. because so much has happened between the corruption and the deletion that it's difficult to tell what caused it), you can go further: have MyCustomAlloc() add each allocated buffer to a global doubly-linked list of allocations, and have MyCustomDelete() remove it from that same list (make sure to serialize these operations if your program is multithreaded!). The advantage is that you can then add another function, called e.g. CheckForHeapCorruption(), that iterates over the list, checks the guard bands of every live allocation, and reports any that have been corrupted. Sprinkle calls to CheckForHeapCorruption() throughout your code, so that heap corruption is detected at the next call rather than some time later. Eventually you will find one call to CheckForHeapCorruption() that passes with flying colors, followed a few lines later by a call that detects corruption. At that point you know the corruption was caused by whatever code executed between the two calls, and you can study that code to figure out what it's doing wrong, and/or add more calls to CheckForHeapCorruption() inside it as necessary.
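The list-based variant could be sketched roughly as follows. This is one possible shape under stated assumptions, not the answer's own code: the namespace guarded, the Header struct, Alloc(), Free(), and DemoOffByOne() are hypothetical names, the allocator is exposed as plain functions rather than wired into operator new/delete, and CheckForHeapCorruption() returns a count of damaged allocations instead of aborting.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstdlib>
#include <cstring>
#include <mutex>
#include <new>

namespace guarded {  // hypothetical namespace for this sketch

constexpr std::size_t kGuard = 64;  // guard band size in bytes

// Bookkeeping stored in front of the leading guard band.
struct Header { Header* prev; Header* next; std::size_t userBytes; };

// Header size rounded up so the user pointer keeps malloc()'s alignment.
constexpr std::size_t kAlign = alignof(std::max_align_t);
constexpr std::size_t kHeaderBytes = (sizeof(Header) + kAlign - 1) / kAlign * kAlign;

static Header* g_head = nullptr;  // global list of live allocations
static std::mutex g_lock;         // serializes list access across threads

static unsigned char* FrontGuard(Header* h) { return reinterpret_cast<unsigned char*>(h) + kHeaderBytes; }
static unsigned char* UserData(Header* h)   { return FrontGuard(h) + kGuard; }
static unsigned char* BackGuard(Header* h)  { return UserData(h) + h->userBytes; }

void* Alloc(std::size_t n)
{
    void* raw = std::malloc(kHeaderBytes + kGuard + n + kGuard);
    if (!raw) return nullptr;
    Header* h = new (raw) Header{nullptr, nullptr, n};
    std::memset(FrontGuard(h), 'B', kGuard);
    std::memset(BackGuard(h), 'E', kGuard);

    std::lock_guard<std::mutex> hold(g_lock);  // push onto the front of the list
    h->next = g_head;
    if (g_head) g_head->prev = h;
    g_head = h;
    return UserData(h);
}

void Free(void* p)
{
    if (!p) return;
    Header* h = reinterpret_cast<Header*>(static_cast<unsigned char*>(p) - kGuard - kHeaderBytes);
    {
        std::lock_guard<std::mutex> hold(g_lock);  // unlink from the list
        if (h->prev) h->prev->next = h->next; else g_head = h->next;
        if (h->next) h->next->prev = h->prev;
    }
    std::free(h);
}

static bool BandIntact(const unsigned char* band, unsigned char fill)
{
    for (std::size_t i = 0; i < kGuard; ++i)
        if (band[i] != fill) return false;
    return true;
}

// Walks every live allocation and returns how many have a damaged guard band.
std::size_t CheckForHeapCorruption()
{
    std::lock_guard<std::mutex> hold(g_lock);
    std::size_t damaged = 0;
    for (Header* h = g_head; h; h = h->next)
    {
        if (!BandIntact(FrontGuard(h), 'B') || !BandIntact(BackGuard(h), 'E'))
        {
            std::fprintf(stderr, "guard band damaged around %p (%zu bytes)\n",
                         static_cast<void*>(UserData(h)), h->userBytes);
            ++damaged;
        }
    }
    return damaged;
}

}  // namespace guarded

// Usage sketch: the question's off-by-one is caught by the very next check,
// without waiting for the buffer to be freed.
inline bool DemoOffByOne()
{
    char* buf = static_cast<char*>(guarded::Alloc(256));
    assert(buf && guarded::CheckForHeapCorruption() == 0);
    std::memset(buf, 0, 256 + 1);  // writes one byte into the trailing guard
    bool caught = (guarded::CheckForHeapCorruption() == 1);
    guarded::Free(buf);
    return caught;
}
```

Returning a count keeps the check usable as an assertion between suspect statements, for example assert(guarded::CheckForHeapCorruption() == 0). An abort() on the first damaged band, as in MyCustomDelete() above, would work just as well if you prefer to stop in the debugger immediately.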

Repeat until the bug becomes obvious. Good luck!
