How to troubleshoot crashes in malloc

Question

I inherited a large body of legacy code that has worked fine until now. At a customer trial it suddenly started crashing in malloc, and I cannot reproduce the crash in-house. I think I need to add instrumentation: for example, my own malloc layered on top of malloc that stores some meta information about each allocation, such as who made the call. When it crashes, I can then look up the meta information and see what was happening. I did something similar years ago but cannot recall the details now. I am sure people have come up with better ideas, and I would be glad to have input.
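The instrumentation idea described above can be sketched roughly as follows. This is a minimal illustration, not a drop-in replacement: `tracked_malloc`, `tracked_free`, `tracked_lookup`, `tracked_header`, and the `TRACKED_MALLOC` macro are all hypothetical names invented for this example, and the sketch assumes a C11 compiler (for `max_align_t`). Note that the bookkeeping lives in the heap itself, so it can be overwritten by the same bug it is meant to diagnose.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical bookkeeping record stored in front of each tracked block.
 * The union with max_align_t keeps the user pointer suitably aligned. */
typedef union tracked_header {
    struct {
        const char *file;  /* source file of the caller */
        int         line;  /* source line of the caller */
        size_t      size;  /* bytes the caller asked for */
    } info;
    max_align_t align;
} tracked_header;

/* Allocate size bytes and remember who asked for them. */
void *tracked_malloc(size_t size, const char *file, int line)
{
    if (size > SIZE_MAX - sizeof(tracked_header))
        return NULL;                      /* size + header would overflow */

    tracked_header *h = malloc(sizeof *h + size);
    if (h == NULL)
        return NULL;

    h->info.file = file;
    h->info.line = line;
    h->info.size = size;
    return h + 1;                         /* user memory starts after header */
}

/* Recover the record for a pointer returned by tracked_malloc. */
const tracked_header *tracked_lookup(const void *p)
{
    return (const tracked_header *)p - 1;
}

/* Release a block obtained from tracked_malloc. */
void tracked_free(void *p)
{
    if (p != NULL)
        free((tracked_header *)p - 1);
}

/* Convenience macro that captures the call site automatically. */
#define TRACKED_MALLOC(n) tracked_malloc((n), __FILE__, __LINE__)
```

With a core dump or a debugger attached, the record in front of a suspicious block shows which call site allocated it and how many bytes it asked for.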

Thanks.

Recommended answer

Is the memory allocation broken?

Try valgrind.

Okay, I'm going to assume you mean that a SIGSEGV (segmentation fault) is raised inside malloc. This is usually caused by heap corruption. Heap corruption, which does not itself cause a segmentation fault at the moment it happens, is usually the result of an array access outside the array's bounds. The faulty access is usually nowhere near the point where you call malloc.

malloc stores a small header of information "in front of" the memory block that it returns to you. In many implementations this header contains the size of the block and a pointer to the next block. Needless to say, changing either of these will cause problems. Typically the next-block pointer is overwritten with an invalid address, and a later call to malloc eventually dereferences the bad pointer and crashes with a segmentation fault. Or it doesn't crash, and starts interpreting random memory as part of the heap. Eventually its luck runs out.

Note that the same thing can happen in free, if the block being released or the free-block list has been corrupted.

How you catch this kind of error depends entirely on how you access the memory that malloc returns. A malloc of a single struct usually isn't a problem; it's the malloc of arrays that usually gets you. A negative index (-1 or -2) will usually land in the block header of your current block, and indexing past the end of the array can land in the header of the next block. Both are valid memory locations, so the access itself causes no segmentation fault.
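Because such stray writes do not fault on their own, one common way to catch them is to surround each allocation with guard bytes and check them later. The following is only a sketch under stated assumptions: `guarded_malloc`, `guarded_check`, `guarded_free`, `GUARD_SIZE`, and `GUARD_BYTE` are hypothetical names chosen for this example, and the layout assumes a C11 compiler on a platform where 16 is a multiple of the strictest alignment.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_SIZE 16     /* bytes of padding on each side (assumption) */
#define GUARD_BYTE 0xA5   /* arbitrary fill pattern (assumption) */

/* Header holding the user size; the union keeps everything aligned. */
typedef union guard_header {
    size_t      size;
    max_align_t align;
} guard_header;

_Static_assert(GUARD_SIZE % _Alignof(max_align_t) == 0,
               "GUARD_SIZE must preserve the alignment of the user pointer");

/* Layout: [header][front guard][user data][back guard] */
void *guarded_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(guard_header) - 2 * GUARD_SIZE)
        return NULL;                       /* total size would overflow */

    unsigned char *raw = malloc(sizeof(guard_header) + 2 * GUARD_SIZE + size);
    if (raw == NULL)
        return NULL;

    ((guard_header *)raw)->size = size;
    unsigned char *user = raw + sizeof(guard_header) + GUARD_SIZE;
    memset(user - GUARD_SIZE, GUARD_BYTE, GUARD_SIZE);  /* front guard */
    memset(user + size, GUARD_BYTE, GUARD_SIZE);        /* back guard */
    return user;
}

/* Returns 0 if both guards are intact, -1 if the front guard was
 * overwritten (underrun), 1 if the back guard was overwritten (overrun). */
int guarded_check(const void *p)
{
    const unsigned char *user  = p;
    const unsigned char *front = user - GUARD_SIZE;
    const unsigned char *raw   = front - sizeof(guard_header);

    for (size_t i = 0; i < GUARD_SIZE; i++)
        if (front[i] != GUARD_BYTE)
            return -1;

    size_t size = ((const guard_header *)raw)->size;
    for (size_t i = 0; i < GUARD_SIZE; i++)
        if (user[size + i] != GUARD_BYTE)
            return 1;

    return 0;
}

/* Release a block obtained from guarded_malloc. */
void guarded_free(void *p)
{
    if (p != NULL)
        free((unsigned char *)p - GUARD_SIZE - sizeof(guard_header));
}
```

Calling `guarded_check` at strategic points (for example, before every free) narrows down where the corruption happens. The guards only detect writes that land within `GUARD_SIZE` bytes of the block, so a tool such as valgrind remains the more thorough option.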

So the first thing to try is range checking. You mention that this appeared at the customer's site. Maybe the data set they are working with is much larger, or the input data is corrupt (e.g. it says to allocate 100 elements and then initializes 101), or they perform operations in a different order (an order that happens to hide the bug in your in-house testing), or they are doing something you haven't tested. It's hard to say without more specifics. You should consider writing something to sanity-check your input data.
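As a rough illustration of range checking, the array and its element count can be kept together and every write routed through a function that refuses out-of-range indexes. The names `int_array`, `int_array_init`, `int_array_set`, and `int_array_free` are hypothetical, made up for this sketch; the legacy code will have its own types.

```c
#include <stddef.h>
#include <stdlib.h>

/* An array that remembers how many elements were allocated. */
typedef struct int_array {
    size_t count;
    int   *items;
} int_array;

/* Allocate count zero-initialized elements. Returns 0 on success, -1 on failure. */
int int_array_init(int_array *a, size_t count)
{
    /* Allocate at least one element so a NULL result always means failure. */
    a->items = calloc(count ? count : 1, sizeof *a->items);
    if (a->items == NULL) {
        a->count = 0;
        return -1;
    }
    a->count = count;
    return 0;
}

/* Store value at index. Returns 0 on success, -1 if index is out of range. */
int int_array_set(int_array *a, size_t index, int value)
{
    if (index >= a->count)
        return -1;          /* would have written past the end of the block */
    a->items[index] = value;
    return 0;
}

/* Release the storage and reset the array to an empty state. */
void int_array_free(int_array *a)
{
    free(a->items);
    a->items = NULL;
    a->count = 0;
}
```

With the "allocate 100, initialize 101" input from the example above, the 101st call to `int_array_set` returns -1 instead of silently overwriting the next block's header, which gives a place to log the bad input.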
