Why doesn't the compiler detect and produce errors when attempting to modify char * string literals?

Question

Assume the following two pieces of code:

char *c = "hello world";
c[1] = 'y';

The above doesn't work.

char c[] = "hello world";
c[1] = 'y';

This one does.

With regard to the first one, I understand that the string "hello world" might be stored in a read-only memory section and hence can't be changed. The second one, however, creates a character array on the stack and hence can be modified.

My question is: why don't compilers detect the first kind of error? Why isn't that part of the C standard? Is there some particular reason for this?

Recommended answer

C compilers are not required to detect the first error, because C string literals are not const.

Reference: the N1256 draft of the C99 standard.

6.4.5 paragraph 5:

In translation phase 7, a byte or code of value zero is appended to each multibyte character sequence that results from a string literal or literals. The multibyte character sequence is then used to initialize an array of static storage duration and length just sufficient to contain the sequence. For character string literals, the array elements have type char, and are initialized with the individual bytes of the multibyte character sequence; [...]

Paragraph 6:

It is unspecified whether these arrays are distinct provided their elements have the appropriate values. If the program attempts to modify such an array, the behavior is undefined.

(C11 does not change this.)

So the string literal "hello, world" is of type char[13] (not const char[13]), which is converted to char* in most contexts.

Attempting to modify a const object has undefined behavior, and most code that attempts to do so must be diagnosed by the compiler (you can get around that with a cast, for example). Attempting to modify a string literal also has undefined behavior, but not because it's const (it isn't); it's because the standard specifically says the behavior is undefined.

For example, this program is strictly conforming:

#include <stdio.h>

void print_string(char *s) {
    printf("%s\n", s);
}

int main(void) {
    print_string("Hello, world");
    return 0;
}

If string literals were const, then passing "Hello, world" to a function that takes a (non-const) char* would require a diagnostic. The program is valid, but it would exhibit undefined behavior if print_string() attempted to modify the string pointed to by s.

The reason is historical. Pre-ANSI C didn't have the const keyword, so there was no way to define a function that takes a char* and promises not to modify what it points to. Making string literals const in ANSI C (1989) would have broken existing code, and there hasn't been a good opportunity to make such a change in later editions of the standard.

gcc's -Wwrite-strings does cause it to treat string literals as const, but makes gcc a non-conforming compiler, since it fails to issue a diagnostic for this:

const char (*p)[6] = &"hello";

("hello" is of type char[6], so &"hello" is of type char (*)[6], which is incompatible with the declared type of p. With -Wwrite-strings, &"hello" is treated as being of type const char (*)[6].) Presumably this is why neither -Wall nor -Wextra includes -Wwrite-strings.

On the other hand, code that triggers a warning with -Wwrite-strings should probably be fixed anyway. It's not a bad idea to write your C code so it compiles without diagnostics both with and without -Wwrite-strings.

(Note that C++ string literals are const, because when Bjarne Stroustrup was designing C++ he wasn't as concerned about strict compatibility for old C code.)
