Why does dereferencing a pointer to a string (char array) return the whole string instead of the first character?

Question

Since the pointer to array points to the first element of the array (having the same address), I don't understand why this happens:

#include <stdio.h>

int main(void) {    
    char (*t)[] = {"test text"};
    printf("%s\n", *t + 1); // prints "est text"
}

Additionally, why does the following code print 2 then?

#include <stdio.h>

int main(void) {    
    char (*t)[] = {1, 2, 3, 4, 5};
    printf("%d\n", *t + 1); // prints "2"
}


Answer

All other answers at the moment of writing this answer were incorrect. Moreover, your question smells like an XY problem, in that the construct you were trying most probably wasn't what you wanted. What you'd really want to do is simply:

char *t = "test text";
printf("%s\n", t);  // prints "test text"

printf("%c\n", t[1]); // prints "e", the 2nd character in the string.






But since you wanted to understand why those things happen, and all the other explanations were wrong, here goes:

Your declaration declares t as a pointer to an array of char:

cdecl> explain char (*t)[];
declare t as pointer to array of char

not an array of pointers as others have suggested. Furthermore, the type of *t is incomplete, so you cannot take its size:

sizeof *t;

will result in

error: invalid application of ‘sizeof’ to incomplete type ‘char[]’
     sizeof *t;

at compile time.

Now, when you try to initialize it with

 char (*t)[] = {"test text"};

it will warn, because while "test text" is an array of (constant) char, here it decays to a pointer to char. Additionally, the braces there are useless; the excerpt above is equivalent to writing:

char (*t)[] = "test text";

in the same way that

int a = 42;

and

int a = {42};

are synonymous. This is C.

To get a pointer to the array, you must apply the address-of operator to the array (the string literal!), so that it does not decay to a pointer:

char (*t)[] = &"test text";

Now t is properly initialized as a pointer to an (immutable) array of char. However, in your case using a pointer of the incorrect type didn't matter, because the two pointers, despite being of incompatible types, pointed to the same address: one pointed to the array of char, and the other to the first character in that array. Thus the observed behaviour was identical.

When you dereference t, which is a pointer-to-array-of-char, you get a locator value (lvalue) of type array-of-char. An array-of-char lvalue then decays, as arrays usually do, to a pointer to its first element, so *t + 1 points to the second character in that array; passing that value to printf with %s prints the contents of the 0-terminated string starting at that pointer.

The behaviour of %s is specified in C11 (n1570) as


%s: If no l length modifier is present, the argument shall be a pointer to the initial element of an array of character type. Characters from the array are written up to (but not including) the terminating null character. [...] If the precision is not specified or is greater than the size of the array, the array shall contain a null character. [...]

(emphasis mine)

As for the second initialization:

char (*t2)[] = {1, 2, 3, 4, 5};

if you compile this with a recent version of GCC, you will get lots of warnings by default. First:

test.c:10:19: warning: initialization makes pointer from integer without a cast [-Wint-conversion]
   char (*t2)[] = {1, 2, 3, 4, 5};
                   ^

Thus 1 is converted from int to a pointer-to-array-of-char without any cast.

Then the compiler will complain about the remaining values:

y.c:10:19: note: (near initialization for ‘t2’)
y.c:10:21: warning: excess elements in scalar initializer
   char (*t2)[] = {1, 2, 3, 4, 5};
                      ^

That is, in your case the 2, 3, 4 and 5 were silently ignored.

The value of that pointer is thus now 1, e.g. on an x86 flat memory model it would point to memory location 1 (though this is naturally implementation defined):

printf("%p\n", (void*)t2);

prints (doubly implementation defined)

0x1

When you dereference this value (which is a pointer-to-array-of-char), you will get an lvalue for array-of-char that starts at memory address 1. When you add 1, this array-of-char lvalue will decay to a pointer-to-char, and as a result you will get ((char*)1) + 1 which is a pointer-to-char whose value is 2. The type of that value can be verified from the warning generated by default by GCC (5.4.0):

y.c:5:10: warning: format ‘%d’ expects argument of type ‘int’, but argument 2 has type ‘char *’ [-Wformat=]
   printf("%d\n",*t2+1); //prints "2"
          ^

That is, the argument has type char *.

Now you pass a (char*)2 as an argument to printf, to be converted using %d, which expects an int. This has undefined behaviour; in your case the byte pattern of (char*)2 is sufficiently confusingly interpreted as 2 and thus it is printed.

And now one realizes that the value printed has nothing to do with 2 in the original initializer:

#include <stdio.h>

int main(void) {
    char (*t2)[] = {1, 42};
    printf("%d\n", *t2 + 1);
}

still prints 2, not 42.

Alternatively for both initializations you could have used the C99 compound literals to initialize:

// Warning: this code is super *evil*
char (*t)[] = &(char []) { "test text" };
char (*t2)[] = &(char []) { 1, 2, 3, 4, 5 };

Though this would probably be even less what you wanted, and the resulting code has no chance of compiling with a C89 or C++ compiler.
