How can dereferencing a NULL pointer in C not crash a program?

Problem description

I need the help of a real C guru to analyze a crash in my code. Not to fix the crash; I can easily fix it, but before doing so I'd like to understand how this crash is even possible, as it seems totally impossible to me.

This crash only happens on a customer machine and I cannot reproduce it locally (so I cannot step through the code using a debugger), as I cannot obtain a copy of this user's database. My company also won't allow me to just change a few lines in the code and make a custom build for this customer (so I cannot add some printf lines and have him run the code again), and of course the customer has a build without debug symbols. In other words, my debugging abilities are very limited. Nonetheless, I could nail down the crash and get some debugging information. However, when I look at that information and then at the code, I cannot understand how the program flow could ever reach the line in question. The code should have crashed long before getting to that line. I'm totally lost here.

Let's start with the relevant code. It's very little code:

// ... code above skipped, not relevant ...

if (data == NULL) return -1;

information = parseData(data);

if (information == NULL) return -1;

/* Check if name has been correctly terminated */
if (information->kind.name->data[information->kind.name->length] != '\0') {
    freeParsedData(information);
    return -1;
}

/* Copy the name */
realLength = information->kind.name->length + 1;
*result = malloc(realLength);
if (*result == NULL) {
    freeParsedData(information);
    return -1;
}
strlcpy(*result, (char *)information->kind.name->data, realLength);

// ... code below skipped, not relevant ...

That's already it. It crashes in strlcpy. I can even tell you how strlcpy is really called at runtime. strlcpy is actually called with the following parameters:

strlcpy ( 0x341000, 0x0, 0x1 );

Knowing this, it is rather obvious why strlcpy crashes. It tries to read one character from a NULL pointer, and that will of course crash. And since the last parameter has a value of 1, the original length must have been 0. My code clearly has a bug here: it fails to check for the name data being NULL. I can fix this, no problem.
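For reference, the missing check mentioned above might look roughly like the following. This is only a sketch: `struct name` and `copy_name` are hypothetical stand-ins (the real struct layout is not shown in the question), and `memcpy` is used in place of `strlcpy` so the snippet also builds on systems without `strlcpy`.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for the question's name struct. */
struct name {
    unsigned char *data;
    size_t length;
};

/* Copies the name into a freshly allocated C string.
 * Returns 0 on success, -1 on invalid input or allocation failure. */
static int copy_name(const struct name *name, char **result)
{
    /* The check the original code lacks: reject NULL before dereferencing. */
    if (name == NULL || result == NULL || name->data == NULL)
        return -1;

    /* The byte at index `length` must be the terminator. */
    if (name->data[name->length] != '\0')
        return -1;

    size_t realLength = name->length + 1;
    char *copy = malloc(realLength);
    if (copy == NULL)
        return -1;

    memcpy(copy, name->data, realLength);   /* copies the terminator too */
    *result = copy;
    return 0;
}
```

Note that this check only guards against a NULL data pointer. It cannot detect a pointer that refers to memory that has already been freed.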

My question is:
How can this code ever get to the strlcpy in the first place?
Why does this code not crash at the if-statement?

I tried it locally on my machine:

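#include <stdio.h>
#include <stdlib.h>
#include <string.h>   /* strlcpy on OS X / BSD */

int main (
    int argc,
    char ** argv
) {
    char * nullString = malloc(10);
    free(nullString);
    nullString = NULL;

    /* Dereferences NULL: this is where the test program crashes */
    if (nullString[0] != '\0') {
        printf("Not terminated\n");
        exit(1);
    }
    printf("Can get past the if-clause\n");

    char xxx[10];
    strlcpy(xxx, nullString, 1);
    return 0;
}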

This code never gets past the if statement. It crashes in the if statement, and that is definitely expected.

So can anyone think of any reason why the first code can get past that if statement without crashing if name->data is really NULL? This is totally mysterious to me. It doesn't seem deterministic.

Important extra information:
The code between the two comments is really complete; nothing has been left out. Further, the application is single-threaded, so there is no other thread that could unexpectedly alter any memory in the background. The platform where this happens is a PPC CPU (a G4, in case that could play any role). And in case someone wonders about "kind.", this is because "information" contains a union named "kind", and name is in turn a struct (every possible value of the union is a different type of struct); but none of this should really matter here.

I'm grateful for any idea here. I'm even more grateful if it's not just a theory, but if there is a way I can verify that this theory really holds true for the customer.

Solution

I accepted the right answer already, but just in case anyone finds this question on Google, here's what really happened:

The pointers were pointing to memory that had already been freed. Freeing memory does not zero it, nor does it make the process give it back to the system at once. So even though the memory had been erroneously freed, it still contained the correct values. The pointer in question was not NULL at the time the "if check" was performed.

After that check I allocate some new memory by calling malloc. I am not sure what exactly malloc does here, but every call to malloc or free can have far-reaching consequences for all dynamic memory in the virtual address space of a process. After the malloc call, the pointer is in fact NULL. Somehow malloc (or some system call malloc uses) zeroes the already freed memory where the pointer itself is located (not the data it points to; the pointer itself lives in dynamic memory). With that memory zeroed, the pointer now has a value of 0x0, which is equal to NULL on my system, and when strlcpy is called it of course crashes.
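The sequence described above can be modeled with a toy allocator. This is a sketch under loud assumptions: `toy_alloc`, `toy_free`, `struct name`, and `stale_pointer_demo` are all hypothetical, and the one-slot "heap" is backed by static storage so that every access is well-defined C. With the real malloc, reading freed memory is undefined behavior, and what the allocator does to a freed block is implementation-specific.

```c
#include <stddef.h>

/* Hypothetical stand-in for the question's name struct. */
struct name {
    const char *data;
    size_t length;
};

/* One-slot toy heap in static storage. */
static struct name slot;

/* "Allocating" reuses the slot and clobbers its previous contents,
 * modeling an allocator that overwrites memory it considers free. */
static void *toy_alloc(void)
{
    slot = (struct name){ NULL, 0 };
    return &slot;
}

/* "Freeing" leaves the contents untouched, so stale reads still look valid. */
static void toy_free(void *p)
{
    (void)p;
}

/* Returns 1 if the stale pointer passed the terminator check before the
 * second allocation and its data field was NULL afterwards. */
static int stale_pointer_demo(void)
{
    struct name *n = toy_alloc();
    n->data = "";                 /* empty name, so length + 1 == 1 */
    n->length = 0;
    toy_free(n);                  /* the real bug: n is still used below */

    /* The "if check": the freed struct still holds its old values. */
    int ok_before = (n->data != NULL && n->data[n->length] == '\0');

    void *other = toy_alloc();    /* stands in for *result = malloc(realLength) */
    (void)other;

    /* The pointer that would now be handed to strlcpy. */
    int null_after = (n->data == NULL);

    return ok_before && null_after;
}
```

In this model the check passes and the data pointer is NULL immediately afterwards, which matches the observed call strlcpy(0x341000, 0x0, 0x1).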

So the real bug causing this strange behavior was at a completely different location in my code. Never forget: freed memory keeps its values, but for how long is beyond your control. To check whether your app has a bug that accesses already freed memory, just make sure memory is always zeroed before it is freed. In OS X you can do this by setting an environment variable at runtime (no need to recompile anything). Of course this slows down the program quite a bit, but you will catch those bugs much earlier.
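Where recompiling is an option, the same idea can be approximated by hand with a wrapper around free. This is a minimal sketch, not the mechanism the environment variable uses: `poison` and `poison_free` are hypothetical helpers, and the caller has to pass the block size because standard C offers no way to query it. The bytes are written through a volatile pointer because a compiler may otherwise drop a plain memset that is immediately followed by free.

```c
#include <stdlib.h>

/* Overwrites `size` bytes at `p` with zeros. The volatile pointer keeps the
 * compiler from optimizing the stores away as dead writes. */
static void poison(void *p, size_t size)
{
    volatile unsigned char *bytes = p;
    if (bytes == NULL)
        return;
    for (size_t i = 0; i < size; i++)
        bytes[i] = 0;
}

/* Zeroes a block and then frees it, so any stale pointer stored inside the
 * block reads as 0 right away instead of at some later, unrelated malloc. */
static void poison_free(void *p, size_t size)
{
    poison(p, size);
    free(p);
}
```

With this wrapper in a debug build, the "if check" from the question would already have dereferenced a zeroed pointer and crashed at the check itself, pointing much closer to the real bug.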

Answer

It is possible that the structure is located in memory that has been free()'d, or the heap is corrupted. In that case, malloc() could be modifying the memory, thinking that it is free.

You might try running your program under a memory checker. One memory checker that supports Mac OS X is valgrind, although it supports Mac OS X only on Intel, not on PowerPC.
