Remote post-mortem coredump analysis without having exact debug symbols for shared system libraries


Problem description

How do you usually get around this problem? Imagine that a thread crashes inside libc code (which is a system shared library) on Computer1 and then generates a coredump. But Computer2, on which this coredump will be analysed, might have a different version of libc.

So:

  1. How important is it to have the same shared library on the remote computer? Will GDB correctly reconstruct the stack trace without having the exact same version of libc on Computer2?

  2. How important is it to have correct debug symbols for libc? Will GDB correctly reconstruct the stack trace without having the exact same debug symbols on Computer2?

  3. And what is the "correct" way to avoid this debug symbol mismatch problem for shared system libraries? It seems to me that no single solution solves this problem in an elegant way. Can anyone share their experience?

Solution

  1. It depends. On some processors, such as x86_64, correct unwind descriptors are required for GDB to properly unwind the stack. On such machines, analyzing a coredump with a non-matching libc will likely produce complete garbage.

  2. You don't need debug symbols for libc to get the stack trace. You wouldn't get file and line numbers without debug symbols, but you should get correct function names (except when inlining has taken place).

  3. The premise of your question is wrong -- debug symbols have nothing to do with this. The "correct" way to analyze a coredump on C2, when that coredump was produced on C1, is to have a copy of C1's libraries (in e.g. /tmp/C1/lib/...) and direct GDB to use that copy instead of C2's installed libc with the

    (gdb) set solib-absolute-prefix /tmp/C1

command.
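One possible way to build the /tmp/C1 copy is sketched below. This is only a sketch, assuming C1 is a Linux machine with ldd, awk and GNU tar available; the helper name ldd_paths and the archive name c1-libs.tar are made up for illustration. Note that ldd only lists libraries linked at load time, so anything the program loaded later with dlopen() would have to be added by hand.

```shell
#!/bin/sh
# Hypothetical helper: read `ldd` output on stdin and print only the
# absolute library paths (entries without a path, such as the vdso,
# are skipped).
ldd_paths() {
    awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^\//) { print $i; break } }'
}

# On C1 (assumes GNU tar; -h stores the files that symlinks point to,
# and -T - reads the file list from stdin):
#   ldd ./exe | ldd_paths | tar -chf c1-libs.tar -T -
#
# On C2 (GNU tar strips the leading "/" from member names, so the files
# land under /tmp/C1/lib/... and so on):
#   mkdir -p /tmp/C1 && tar -xf c1-libs.tar -C /tmp/C1
```

The directory layout under /tmp/C1 mirrors the original absolute paths, which is what the prefix setting expects.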

Note: the solib-absolute-prefix setting must be in effect before you load the core into GDB. This:

    gdb exe core
    (gdb) set solib-absolute-prefix /tmp/C1

will not work (the core is read before the setting takes effect).

Here is the right way:

    gdb exe
    (gdb) set solib-absolute-prefix /tmp/C1
    (gdb) core core

(I've tried to find a reference to this on the web, but couldn't find one.)
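The same ordering can be captured in a GDB command file so it is not typed by hand each time. The following is a minimal sketch, not a definitive recipe: the function name gdb_core_script and the file name load-core.gdb are hypothetical, and core-file is simply the long form of the core command used above.

```shell
#!/bin/sh
# Hypothetical helper: print a GDB command file that sets the library
# prefix first and only then loads the core.
gdb_core_script() {
    prefix=$1   # directory holding the copy of C1's libraries
    core=$2     # path to the core file
    printf 'set solib-absolute-prefix %s\n' "$prefix"
    printf 'core-file %s\n' "$core"
    printf 'bt\n'
}

# Usage (the core is deliberately NOT passed on the gdb command line):
#   gdb_core_script /tmp/C1 core > load-core.gdb
#   gdb -x load-core.gdb exe
```

Because GDB runs the commands in the file in order, the prefix is already in effect when the core is read.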

What are unwind descriptors?

Unwind descriptors are required when code is compiled without frame pointers (the default for x86_64 in optimized mode). Such code does not save the %rbp register, so GDB needs to be told how to "step back" from the current frame to the caller's frame (this process is also known as stack unwinding).

Why isn't C1's libc.so included in the core?

The core file usually contains only the contents of the writable segments of the program's address space. The read-only segments (where executable code and unwind descriptors reside) are not usually necessary -- you could just read them directly from libc.so on disk.

Except this doesn't work when you analyze C1's core on C2!

Some (but not all) operating systems allow one to configure "full coredumps", where the OS will dump read-only mappings as well, precisely so you can analyze the core on any machine.
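On Linux, for instance, this is controlled per process through /proc/PID/coredump_filter, a bit mask described in the core(5) man page: bits 2 and 3 select file-backed private and shared mappings, and the default mask is 0x33. The sketch below only computes the new mask; the function name with_file_backed is made up for illustration, and whether the resulting core is complete enough for your case is something to verify rather than assume.

```shell
#!/bin/sh
# Hypothetical helper: add the file-backed private/shared bits
# (bit 2 = 0x04, bit 3 = 0x08) to an existing coredump_filter mask.
with_file_backed() {
    printf '0x%x\n' $(( $1 | 0x0c ))
}

with_file_backed 0x33   # prints 0x3f (0x33 is the Linux default mask)

# To apply it to the current shell and the programs it starts
# (the setting is inherited by child processes):
#   echo 0x3f > /proc/self/coredump_filter
```

Expect noticeably larger core files with this setting, since the code of every mapped library is written out as well.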
