strtok segmentation fault
Question
I am trying to understand why the following snippet of code is giving a segmentation fault:
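#include <stdio.h>
#include <string.h>

void tokenize(char* line)
{
    char* cmd = strtok(line, " ");
    while (cmd != NULL)
    {
        printf("%s\n", cmd);
        cmd = strtok(NULL, " ");
    }
}

int main(void)
{
    tokenize("this is a test");  /* passes a string literal, which strtok will try to modify */
}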
I know that strtok() does not actually tokenize string literals, but in this case, line points directly to the string "this is a test", which is internally an array of char. Is there any way of tokenizing line without copying it into an array?
Recommended answer
The problem is that you're attempting to modify a string literal. Doing so causes your program's behavior to be undefined.
Saying that you're not allowed to modify a string literal is an oversimplification. Saying that string literals are const is incorrect; they're not.
WARNING: Digression follows.
The string literal "this is a test" is an expression of type char[15] (14 for the length, plus 1 for the terminating '\0'). In most contexts, including this one, such an expression is implicitly converted to a pointer to the first element of the array, of type char*.
The behavior of attempting to modify the array referred to by a string literal is undefined, not because it's const (it isn't), but because the C standard specifically says that it's undefined.
Some compilers might permit you to get away with this. Your code might actually modify the static array corresponding to the literal (which could cause great confusion later on).
Most modern compilers, though, will store the array in read-only memory -- not physical ROM, but in a region of memory that's protected from modification by the virtual memory system. The result of attempting to modify such memory is typically a segmentation fault and a program crash.
So why aren't string literals const? Since you really shouldn't try to modify them, it would certainly make sense, and C++ does make string literals const. The reason is historical. The const keyword didn't exist before it was introduced by the 1989 ANSI C standard (though it was probably implemented by some compilers before that). So a pre-ANSI program might look like this:
#include <stdio.h>

print_string(s)
char *s;
{
    printf("%s\n", s);
}

main()
{
    print_string("Hello, world");
}
There was no way to enforce the fact that print_string isn't allowed to modify the string pointed to by s. Making string literals const in ANSI C would have broken existing code, which the ANSI C committee tried very hard to avoid doing. There hasn't been a good opportunity since then to make such a change to the language. (The designers of C++, mostly Bjarne Stroustrup, weren't as concerned about backward compatibility with C.)