How does gdb reconstruct a stack trace for C++?

Question

I have divided the whole question into smaller ones:

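  1. Which different algorithms can GDB use to reconstruct a stack trace?
  2. How does each stack trace reconstruction algorithm work at a high level? What are the advantages and disadvantages?
  3. What kind of meta-information does the compiler need to provide in the program for each algorithm to work?
  4. Which g++ compiler switches enable or disable a particular algorithm?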

Recommended answer

In pseudocode terms, you could call the stack "an array of packed stack frames", where every stack frame is a variable-size data structure that you could express like this:

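#include <cstddef>
#include <cstdint>

// Conceptual model of one stack frame; N differs from function to function.
template <std::size_t N>
struct stackframe {
    std::uintptr_t contents[N];  // locals, spills, saved registers
#ifndef OMIT_FRAME_POINTER
    void *nextfp;                // caller's frame (its N is unknown, hence void *)
#endif
    void *retaddr;               // return address into the caller
};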

The problem is that every function has a different N: frame sizes vary.

The compiler knows the frame sizes and, when it generates debugging information, usually emits them as part of it. All the debugger then needs to do is take the current program counter, look up the function it belongs to in the symbol table, and use that to look up the frame size in the debugging information. Add the frame size to the stack pointer and you arrive at the beginning of the next (the caller's) frame.
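The table-driven walk described above can be sketched on a simulated stack. This is a minimal illustration, not how GDB implements it: `FuncInfo`, `lookup`, and `unwind_by_frame_size` are hypothetical names, the stack is modeled as a vector of words with index 0 at the top of the stack, and the per-function frame size stands in for whatever the debugging information actually records.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical per-function record standing in for compiler-emitted debug info.
struct FuncInfo {
    std::uintptr_t start, end;  // code address range [start, end)
    std::size_t frame_words;    // words between SP and the return-address slot
};

// Find the function whose code range contains pc, or nullptr if none does.
const FuncInfo* lookup(const std::vector<FuncInfo>& table, std::uintptr_t pc) {
    for (const FuncInfo& f : table)
        if (pc >= f.start && pc < f.end) return &f;
    return nullptr;
}

// Walk a simulated stack (index 0 = top of stack, i.e. lowest address).
// Returns the chain of program counters, innermost frame first.
std::vector<std::uintptr_t> unwind_by_frame_size(
        const std::vector<std::uintptr_t>& stack,
        const std::vector<FuncInfo>& table,
        std::uintptr_t pc, std::size_t sp) {
    assert(sp <= stack.size());
    std::vector<std::uintptr_t> trace;
    while (const FuncInfo* f = lookup(table, pc)) {
        trace.push_back(pc);
        std::size_t ret_slot = sp + f->frame_words;  // skip this frame's contents
        if (ret_slot >= stack.size()) break;         // ran off the simulated stack
        pc = stack[ret_slot];                        // return address into the caller
        sp = ret_slot + 1;                           // caller's SP after the return
    }
    return trace;
}
```

The walk stops as soon as a program counter does not belong to any known function, which is also how a missing table entry would cut a real backtrace short.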

With this method you do not need frame linkage, and backtracing works fine even if you compile with -fomit-frame-pointer. On the other hand, if you do have frame linkage, iterating over the stack is just following a linked list, because the function prologue initializes the frame pointer in every new stack frame to point to the previous one.
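Following the frame-pointer chain can be sketched the same way. The layout below mirrors the stackframe pseudocode earlier in the answer (saved frame pointer immediately followed by the return address); the `kEndOfChain` sentinel, the index-based "pointers", and the function name are assumptions of this toy model rather than anything a real ABI or GDB defines.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Sentinel stored in the outermost frame's link slot (an assumption of this model).
constexpr std::uintptr_t kEndOfChain = ~std::uintptr_t{0};

// Simulated stack layout:
//   stack[fp]     = caller's frame pointer (an index into stack)
//   stack[fp + 1] = return address into the caller
// Returns the chain of program counters, innermost frame first.
std::vector<std::uintptr_t> unwind_by_frame_pointer(
        const std::vector<std::uintptr_t>& stack,
        std::uintptr_t pc, std::uintptr_t fp) {
    assert(!stack.empty());
    std::vector<std::uintptr_t> trace{pc};
    while (fp != kEndOfChain && fp + 1 < stack.size()) {
        std::uintptr_t next_fp = stack[fp];
        trace.push_back(stack[fp + 1]);
        // Frames must move toward the stack base; stop on a corrupt or looping chain.
        if (next_fp != kEndOfChain && next_fp <= fp) break;
        fp = next_fp;
    }
    return trace;
}
```

No table lookup is needed here, which is why this walk is cheap, but it only works if every function on the stack actually maintains the link.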

If you have neither frame size information nor frame pointers, but still have a symbol table, you can also backtrace with a bit of reverse engineering that calculates the frame sizes from the actual binary. Start with the program counter, look up the function it belongs to in the symbol table, and disassemble the function from its start. Isolate all operations between the beginning of the function and the program counter that actually modify the stack pointer (anything that writes to the stack and/or allocates stack space). That gives you the current function's frame size at that point; add it to the stack pointer (assuming the usual downward-growing stack) and you should, on most architectures, find the last word written to the stack before the function was entered, which is usually the return address into the caller. Repeat as necessary.
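The prologue-analysis step can be illustrated with a toy instruction set. This is only a sketch under strong assumptions: `Op`, `Insn`, and `frame_words_at` are hypothetical, the code is treated as straight-line (no branches between function entry and the program counter), and a real implementation would decode actual machine instructions instead.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy instruction set (hypothetical; a real unwinder decodes machine code).
enum class Op { Push, Pop, SubSp, AddSp, Other };
struct Insn { Op op; std::size_t words; };  // words is used by SubSp/AddSp only

// Sum the stack-pointer adjustments made by the instructions executed between
// function entry and pc_index (exclusive). The result is the current frame
// size in words, so the return address sits that many words above SP.
std::size_t frame_words_at(const std::vector<Insn>& func, std::size_t pc_index) {
    assert(pc_index <= func.size());
    std::size_t depth = 0;
    for (std::size_t i = 0; i < pc_index; ++i) {
        switch (func[i].op) {
            case Op::Push:  depth += 1; break;
            case Op::Pop:   depth -= 1; break;
            case Op::SubSp: depth += func[i].words; break;  // allocate stack space
            case Op::AddSp: depth -= func[i].words; break;  // release stack space
            case Op::Other: break;                          // does not touch SP
        }
    }
    return depth;
}
```

Note that the result depends on where the program counter is: stopped at the first instruction, the frame is still empty and the return address is at the top of the stack.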

Finally, you can perform a heuristic analysis of the stack contents. Isolate all words on the stack that point into executable mappings of the process address space (and could therefore be return addresses), then play a what-if game: look at the memory just before each candidate address, disassemble it, and check whether it really is some kind of call instruction; if so, check whether it really called the "next" function, and whether you can construct an uninterrupted call sequence from the candidates. This works to a degree even if the binary is completely stripped (although in that case all you get is a list of return addresses). I don't think GDB employs this technique, but some embedded low-level debuggers do. On x86 this is terribly difficult because instruction lengths vary, so you cannot easily "step back" through an instruction stream; on RISC architectures with fixed instruction lengths, such as ARM, it is much simpler.
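The first two filters of that heuristic can be sketched as follows. The names are hypothetical, and `call_return_sites` is a stand-in for the hard part, namely disassembling the bytes in front of each candidate to see whether a call instruction ends there. The final check (does the call target match the next frame's function) is left out of this sketch.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

struct Range { std::uintptr_t start, end; };  // executable mapping [start, end)

// Return every stack word that (a) points into an executable mapping and
// (b) is immediately preceded by a call instruction. The set of call return
// sites is assumed to be given; a real tool would derive it by disassembly.
std::vector<std::uintptr_t> scan_for_return_addresses(
        const std::vector<std::uintptr_t>& stack,
        const std::vector<Range>& exec_ranges,
        const std::set<std::uintptr_t>& call_return_sites) {
    std::vector<std::uintptr_t> candidates;
    for (std::uintptr_t word : stack) {
        bool executable = false;
        for (const Range& r : exec_ranges) {
            assert(r.start <= r.end);
            if (word >= r.start && word < r.end) { executable = true; break; }
        }
        if (executable && call_return_sites.count(word) != 0)
            candidates.push_back(word);  // plausible return address
    }
    return candidates;
}
```

Stale return addresses left over from earlier calls pass both filters too, which is one reason the result is only a best guess.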

There are some pitfalls that sometimes make simple, or even complex and exhaustive, implementations of these algorithms fall over, such as tail calls (for example in tail-recursive functions) and inlined code. The gdb source code might give you some more ideas:

https://sourceware.org/git/?p=binutils-gdb.git;a=blob;f=gdb/frame.c

GDB employs a variety of such techniques.
