Debugging a clobbered static variable in C (gdb broken?)
Question
I've done a lot of programming but not much in C, and I need advice on debugging. I have a static variable (file scope) that is being clobbered after about 10-100 seconds of execution of a multithreaded program (using pthreads on OS X 10.4). My code looks something like this:
static float some_values[SIZE];
static int * addr;
addr points to a valid memory address for a while, and then gets clobbered with some value (sometimes 0, sometimes nonzero), thereby causing a segfault when dereferenced. Poking around with gdb, I have verified that addr is being laid out in memory immediately after some_values, as one would expect, so my first guess would be that I have used an out-of-bounds index to write to some_values. However, this is a tiny file, so it is easy to check that this is not the problem.
The obvious debugging technique would be to set a watchpoint on the variable addr. But doing so seems to create erratic and inexplicable behavior in gdb. The watchpoint gets triggered at the first assignment to addr; then, after I continue execution, I immediately get a nonsensical segfault in another thread... supposedly a segfault on accessing the address of a static variable in a different part of the program! But then gdb lets me read from and write to that memory address interactively.
Program received signal EXC_BAD_ACCESS, Could not access memory.
Reason: KERN_PROTECTION_FAILURE at address: 0x001d5bd0
0x0000678d in receive (arg=0x0) at mainloop.c:39
39 sample_buf_cleared ++;
(gdb) p &sample_buf_cleared
$17 = (int *) 0x1d5bd0
(gdb) p sample_buf_cleared
$18 = 1
(gdb) set sample_buf_cleared = 2
(gdb)
gdb is obviously confused. Does anyone know why? Or does anyone have any suggestions for debugging this bug without using watchpoints?
Recommended answer
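- You can put an array of uints between some_values and addr and determine whether you are overrunning some_values or whether the corruption affects more addresses than you first thought. I would initialize the padding to DEADBEEF or some other obvious pattern that is easy to distinguish and unlikely to occur in the program. If a value in the padding changes, cast it to float and see if the number makes sense.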
static float some_values[SIZE];
static unsigned int padding[1024];
static int * addr;
- Run the program multiple times. In each run, disable a different thread and see when the problem goes away.
- Set the program's process affinity to a single core and then try the watchpoint. You may have better luck if you don't have two threads simultaneously modifying the value. NOTE: This does not prevent the corruption from happening; it may just make it easier to catch in a debugger.
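The padding idea from the first suggestion could be sketched roughly as follows. This is only an illustration, not the asker's code: the value of SIZE, the names PAD_WORDS, PAD_PATTERN, padding_init, padding_first_damage, word_as_float and padding_report are all assumptions made up for the example. Note also that C does not guarantee the three statics are laid out in declaration order, so it is worth confirming the addresses (the report prints them) before trusting the result.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define SIZE 256                  /* assumed; the question does not give SIZE */
#define PAD_WORDS 1024            /* same padding length as in the answer */
#define PAD_PATTERN 0xDEADBEEFu   /* easy to spot, unlikely to occur by chance */

static float some_values[SIZE];
static unsigned int padding[PAD_WORDS];
static int *addr;

/* Fill the guard region with the known pattern. Call once at startup,
   before any worker thread is created. */
void padding_init(void)
{
    for (size_t i = 0; i < PAD_WORDS; i++)
        padding[i] = PAD_PATTERN;
}

/* Return the index of the first damaged word, or -1 if the padding is intact. */
long padding_first_damage(void)
{
    for (size_t i = 0; i < PAD_WORDS; i++)
        if (padding[i] != PAD_PATTERN)
            return (long)i;
    return -1;
}

/* Reinterpret the bits of a damaged word as a float.
   Assumes sizeof(float) == sizeof(unsigned int), which is typical. */
float word_as_float(unsigned int w)
{
    float f;
    memcpy(&f, &w, sizeof f);
    return f;
}

/* Print the layout and every damaged word, both as hex and as a float. */
void padding_report(void)
{
    printf("some_values at %p, padding at %p, addr at %p\n",
           (void *)some_values, (void *)padding, (void *)&addr);
    for (size_t i = 0; i < PAD_WORDS; i++)
        if (padding[i] != PAD_PATTERN)
            printf("padding[%lu] = 0x%08x (as float: %g)\n",
                   (unsigned long)i, padding[i],
                   (double)word_as_float(padding[i]));
}
```

One way to use it would be to call padding_init() at startup and then call padding_report() periodically, or from gdb with call padding_report() once the program has stopped. If damaged words near index 0 look like plausible floats, an overrun of some_values is the likely culprit; if the padding stays intact while addr still gets clobbered, the write is probably coming from somewhere else. Reading the padding while other threads are running is itself unsynchronized, so treat this purely as a debugging aid.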