Linux: How to debug a SIGSEGV? How do I trace the error source?


Problem description

My Firefox started crashing today. I haven't changed anything in the system or in the Firefox configuration.

I used
strace -ff -o dumpfile.txt firefox
to trace the problem, but it was not much help.

I see the segfault in two of the generated process dumps, but how can I trace it back to its cause?

Firefox runs for about 10 seconds before crashing, and in that time strace generates 22 MB of data.

Here is a snippet of the output, with the actual SIGSEGV in the middle:

read(19, "\372", 1)                     = 1
gettimeofday({1245590019, 542231}, NULL) = 0
read(3, "\6\0[Qmy\26\0\3\1\0\0Y\0\200\2\0\0\0\0\323\3A\0\323\3(\0\20\0\1\0", 4096) = 32
read(3, 0xf5c55058, 4096)               = -1 EAGAIN (Resource temporarily unavailable)
gettimeofday({1245590019, 542813}, NULL) = 0
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=8, events=POLLIN|POLLPRI}, {fd=12, events=POLLIN|POLLPRI}, {fd=13, events=POLLIN|POLLPRI}, {fd=14, events=POL
read(3, 0xf5c55058, 4096)               = -1 EAGAIN (Resource temporarily unavailable)
gettimeofday({1245590019, 543161}, NULL) = 0
gettimeofday({1245590019, 546672}, NULL) = 0
gettimeofday({1245590019, 546761}, NULL) = 0
read(3, 0xf5c55058, 4096)               = -1 EAGAIN (Resource temporarily unavailable)
gettimeofday({1245590019, 546936}, NULL) = 0
poll([{fd=4, events=POLLIN}, {fd=3, events=POLLIN}, {fd=8, events=POLLIN|POLLPRI}, {fd=12, events=POLLIN|POLLPRI}, {fd=13, events=POLLIN|POLLPRI}, {fd=14, events=POL
poll([{fd=3, events=POLLIN|POLLOUT}], 1, 4294967295) = 1 ([{fd=3, revents=POLLOUT}])
writev(3, [{"5\30\4\0006\21\200\2\266\n\200\2\17\0]\3\230\4\5\0007\21\200\0026\21\200\2\317\0\0\0"..., 1624}, {NULL, 0}, {"", 0}], 3) = 1624
poll([{fd=3, events=POLLIN}], 1, 4294967295) = 1 ([{fd=3, revents=POLLIN}])
read(3, "\1\30\224Q\17\17\0\0\0\0\0\0\0\0\0\0000\235\273\0\0\0\0\0o\264Q\0\0\0\0\0"..., 4096) = 4096
read(3, "\375\240f\0\376\242j\0\377\261\200\0\271a+\0\271a+\0\377\261\200\0\376\252w\0\376\250s\0"..., 11356) = 11356
read(3, 0xf5c55058, 4096)               = -1 EAGAIN (Resource temporarily unavailable)
poll([{fd=3, events=POLLIN|POLLOUT}], 1, 4294967295) = 1 ([{fd=3, revents=POLLOUT}])
writev(3, [{"\230\32\7\0\1\21\200\2?\21\200\2\377\377\377\377\377\377\377\377\0\0\0\0\17\0\1\0015\10\4\0"..., 956}, {NULL, 0}, {"", 0}], 3) = 956
poll([{fd=3, events=POLLIN}], 1, 4294967295) = 1 ([{fd=3, revents=POLLIN}])
read(3, "\1\30\256Q\17\17\0\0\0\0\0\0\0\0\0\0000\235\273\0\0\0\0\0o\264Q\0\0\0\0\0"..., 4096) = 4096
read(3, "\375\240f\0\376\242j\0\377\261\200\0\271a+\0\271a+\0\377\261\200\0\376\252w\0\376\250s\0"..., 11356) = 11356
read(3, 0xf5c55058, 4096)               = -1 EAGAIN (Resource temporarily unavailable)
--- SIGSEGV (Segmentation fault) @ 0 (0) ---
unlink("/home/userrrr/.mozilla/firefox/mvbnkitl.default/lock") = 0
rt_sigaction(SIGSEGV, {SIG_DFL, ~[HUP INT QUIT ABRT BUS FPE KILL PIPE CHLD CONT TTOU URG XCPU WINCH RT_1 RT_2 RT_3 RT_4 RT_8 RT_11 RT_14 RT_17 RT_22], SA_NOCLDSTOP},
rt_sigprocmask(SIG_BLOCK, ~[ILL ABRT BUS FPE SEGV RTMIN RT_1], ~[KILL STOP RTMIN RT_1], 8) = 0
open("/home/userrrr/.mozilla/firefox/mvbnkitl.default/minidumps/56b30367-5ee2-0495-32646b7f-59dc87e9.dmp", O_WRONLY|O_CREAT|O_EXCL, 0600) = 63
clone(child_stack=0xf5bfffe4, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_UNTRACED) = 18929
waitpid(18929, NULL, __WALL) = 18929
open("/proc/18913/task", O_RDONLY|O_NONBLOCK|O_LARGEFILE|O_DIRECTORY|O_CLOEXEC) = 64
fstat64(64, {st_mode=S_IFDIR|0555, st_size=0, ...}) = 0
getdents64(64, /* 12 entries */, 1024)  = 368
ptrace(PTRACE_DETACH, 18913, 0, SIG_0)  = -1 ESRCH (No such process)
close(64)                               = 0
ftruncate(63, 91256)                    = 0
close(63)                               = 0
rt_sigprocmask(SIG_SETMASK, ~[KILL STOP RTMIN RT_1], ~[KILL STOP RTMIN RT_1], 8) = 0
time(NULL)                              = 1245590020
open("/home/userrrr/.mozilla/firefox/Crash Reports/LastCrash", O_WRONLY|O_CREAT|O_TRUNC, 0600) = 63
write(63, "1245590020", 10)             = 10
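
With -ff, strace writes one file per traced process or thread, named after the -o prefix plus the process ID (here dumpfile.txt.<pid>), so the files that contain the fault can be located without reading all 22 MB by hand. The following is only a sketch under that assumption; find_signal is an illustrative helper name, not part of strace, and the number of context lines is arbitrary.

```python
import glob
from collections import deque


def find_signal(path, signal_name="SIGSEGV", context=3):
    """Return (line_number, preceding_lines) for each signal line in one strace file."""
    # strace reports a delivered signal on a line that starts with "--- SIGNAME".
    marker = "--- " + signal_name
    recent = deque(maxlen=context)  # the last few lines seen before the current one
    hits = []
    with open(path, errors="replace") as handle:
        for number, line in enumerate(handle, start=1):
            if line.startswith(marker):
                hits.append((number, list(recent)))
            recent.append(line.rstrip("\n"))
    return hits


if __name__ == "__main__":
    # "dumpfile.txt" is the prefix passed to strace -o in the command above.
    for path in sorted(glob.glob("dumpfile.txt.*")):
        for number, before in find_signal(path):
            print(f"{path}:{number}: SIGSEGV delivered; last calls before it:")
            for call in before:
                print("    " + call)
```

The file name suffix tells you which process or thread received the signal, which is useful later when choosing the thread to inspect in a debugger.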

Solution

Ivan, your real question is "how do I debug a SIGSEGV?"

strace is rarely much help here. SIGSEGV means that the application tried to dereference (access) a memory location that has not been allocated (or that it is not allowed to access for some other reason). Chances are high that the crash is unrelated to the system call activity that strace captures. To discover the cause of the crash, start by finding out which address is being dereferenced and which function is trying to do so. A debugger is the right tool for this task.
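
To make the point concrete, here is a small illustration (not taken from the Firefox crash) of a process being killed by SIGSEGV without any system call being involved. It assumes a Linux system with Python 3; the child asks ctypes to read a C string starting at address 0, which is not mapped in a normal process. The name run_crashing_child is made up for this example.

```python
import signal
import subprocess
import sys

# Child program: read a C string starting at address 0. Nothing is mapped
# there, so the read itself faults and the kernel delivers SIGSEGV.
CHILD = "import ctypes; ctypes.string_at(0)"


def run_crashing_child():
    """Run the child and return its exit status as seen by the parent."""
    result = subprocess.run(
        [sys.executable, "-c", CHILD],
        stderr=subprocess.DEVNULL,
    )
    return result.returncode


if __name__ == "__main__":
    code = run_crashing_child()
    if code < 0:
        # subprocess reports death by signal N as a return code of -N.
        print("child killed by", signal.Signals(-code).name)
    else:
        print("child exited with status", code)
```

The faulting access is an ordinary memory read in user space, which is why a system call trace shows the signal arriving but nothing about what triggered it.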

Here's what you need to do:

 gdb <your_app_name> <your_coredump_file>

Once inside gdb, examine the last executed instruction and use "info registers" to find the address in question. The "bt" command shows the call stack. By walking up the call stack, you will discover how the incorrect address was calculated. One of the steps involved in that calculation is the cause of your problem.
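
The same inspection can be run non-interactively. The sketch below only builds the command line, using gdb's -batch and -ex options to run "info registers", "x/i $pc" (disassemble the instruction at the program counter) and "bt" in one go. The paths ./your_app and core are placeholders, and it assumes a core file was actually written (core dumps often have to be enabled first, for example with ulimit -c unlimited in the shell that starts the program).

```python
import shlex


def gdb_batch_command(program, core_file):
    """Build a gdb invocation that prints registers, the faulting instruction and a backtrace."""
    return [
        "gdb", "-batch",          # run the given commands, then exit
        "-ex", "info registers",  # register contents at the time of the crash
        "-ex", "x/i $pc",         # the instruction that was executing
        "-ex", "bt",              # the call stack of the crashing thread
        program, core_file,
    ]


if __name__ == "__main__":
    # Placeholder paths; substitute the real binary and core file.
    command = gdb_batch_command("./your_app", "core")
    print(" ".join(shlex.quote(part) for part in command))
    # To execute it for real: import subprocess; subprocess.run(command)
```

Without debugging symbols for the binary and its libraries, the backtrace will mostly show raw addresses, so installing the matching debug symbol packages first makes the output far easier to read.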

Debugging is fun and this is a good opportunity to delve into it. A good book or some online articles can help you there. Google away and good luck!
