What to make of an "impossible" stack trace after a crash?
Problem description
My program appears to be suffering from a very hard-to-reproduce bug: Once in a blue moon, when a user puts his Mac to sleep and later on wakes it back up again, one of my program's child processes will crash immediately after the Mac wakes up.
When this happens Apple's crash-reporter mechanism reliably reports a stack trace like this one:
Thread 0 Crashed:: Dispatch queue: com.apple.main-thread
0 libsystem_kernel.dylib 0x967f9a6a __pthread_kill + 10
1 libsystem_c.dylib 0x9003dacf pthread_kill + 101
2 libsystem_c.dylib 0x900744f8 abort + 168
3 com.meyersound.VirtualD-Mitri 0x0014438e muscle::CrashSignalHandler(int) + 190
4 libsystem_c.dylib 0x9002886b _sigtramp + 43
5 ??? 0xffffffff 0 + 4294967295
6 com.meyersound.VirtualD-Mitri 0x001442d0 muscle::ParsePortArg(muscle::Message const&, muscle::String const&, unsigned short&, unsigned long) + 80
7 com.meyersound.VirtualD-Mitri 0x005b3393 qnet::RepDBPeer::Pulse(muscle::PulseNode::PulseArgs const&) + 1187
8 com.meyersound.VirtualD-Mitri 0x0015717b muscle::PulseNode::PulseAux(unsigned long long) + 203
9 com.meyersound.VirtualD-Mitri 0x000cfb90 muscle::ReflectServer::ServerProcessLoop() + 3232
10 com.meyersound.VirtualD-Mitri 0x00607c7e dcasldmain(int, char**) + 2222
11 com.meyersound.VirtualD-Mitri 0x0072c14d dmitridmain(int, char**) + 4749
12 com.meyersound.VirtualD-Mitri 0x0000bc3a main + 4938
13 com.meyersound.VirtualD-Mitri 0x000061ab _start + 209
14 com.meyersound.VirtualD-Mitri 0x000060d9 start + 41
This is all well and good, except (cue eerie music) -- it's logically impossible. In particular, not only does my RepDBPeer::Pulse() method never call ParsePortArg, the crashing process never calls ParsePortArg anywhere! (I grepped all of my source code twice to make sure.)
So my question is, just what is this stack trace trying to tell me? Is this most likely a case of Thread 0's stack getting corrupted badly enough that the stack-trace mechanism has gone off the rails and fingered an innocent bystander as the culprit? Or is it possible that Apple's wakeup mechanism somehow "jumped" the program counter into ParsePortArg() (whereupon the resulting confusion caused the crash)? Or is there some other deeper magic going on here that I can't even imagine?
The crashing process in question is a vanilla non-GUI background process that is a child process spawned by a Qt GUI process, for what that's worth.
Recommended answer
I'm going to assume you have some amount of optimization turned on. There's no magic to stack traces. They become increasingly fuzzy (read: less accurate) once code is inlined or omitted, which is precisely what a C++ optimizer does.
In the case of ParsePortArg, there is a +80 at the end of that line, meaning 80 bytes past the entry point of that function in the code segment. The true address recorded for that frame is 0x001442d0, and ParsePortArg is simply the nearest symbol that the stack dump could guess at. You were right to assume it was a red herring.
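To make the "nearest symbol plus offset" idea concrete, here is a minimal sketch of how a symbolicator could turn a raw address into the `name + offset` form seen in the trace. The `SymbolTable` type and the functions `nearest_symbol` and `table_from_trace` are hypothetical names invented for this illustration, and the entry addresses are not taken from the binary; they are derived by subtracting the printed offsets from the printed addresses. Apple's crash reporter may well work differently (for example, it also knows where each loaded image begins and ends), so treat this as an illustration of the principle only.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>
#include <utility>

// Hypothetical symbol table: function entry address -> function name.
using SymbolTable = std::map<std::uint32_t, std::string>;

// Returns {name, offset} for the nearest symbol at or below addr.
// If no symbol precedes addr, returns {"???", addr}, mirroring the
// "??? ... 0 + <address>" form of frame 5 in the trace.
std::pair<std::string, std::uint32_t>
nearest_symbol(const SymbolTable& table, std::uint32_t addr) {
    auto it = table.upper_bound(addr);   // first entry strictly above addr
    if (it == table.begin()) {
        return {"???", addr};            // nothing at or below addr
    }
    --it;                                // last entry at or below addr
    assert(it->first <= addr);
    return {it->second, addr - it->first};
}

// Entry addresses derived from the trace (assumption: address - offset).
SymbolTable table_from_trace() {
    return {
        {0x001442d0u - 80u,   "muscle::ParsePortArg"},    // 0x00144280
        {0x005b3393u - 1187u, "qnet::RepDBPeer::Pulse"},
    };
}
```

With this table, `nearest_symbol(table_from_trace(), 0x001442d0)` yields `muscle::ParsePortArg` with offset 80. The lookup only says which symbol starts closest below the address; it says nothing about whether that function was ever called. As a side observation from the same arithmetic, subtracting frame 3's offset from its address (0x0014438e - 190) also gives 0x001442d0, so the address in frame 6 coincides with the computed entry point of muscle::CrashSignalHandler. Whether the crash reporter attributed that address to the preceding symbol is an assumption that the trace alone does not confirm.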
Barring any other data, I would make the very conservative guess that your program expects some pointer to remain valid that is no longer valid on wakeup from sleep. Look at the disassembly for the instruction at that address. I bet it is trying to read from a memory address.