Why does the arrow (->) operator in C exist?

Question

The dot (.) operator accesses a member of a struct, while the arrow operator (->) accesses a member of a struct through a pointer to that struct.

A pointer itself has no members that could be accessed with the dot operator (it is really just a number describing a location in virtual memory). So there would be no ambiguity if the dot operator were defined to dereference automatically whenever it is applied to a pointer (information that, AFAIK, the compiler has at compile time).

So why did the language creators decide to make things more complicated by adding this seemingly unnecessary operator? What was the big design decision?

Recommended answer

I'll interpret your question as two questions: 1) why -> even exists, and 2) why . does not automatically dereference the pointer. The answers to both questions have historical roots.

Why does -> even exist?

In one of the very first versions of the C language (which I will call CRM, for "C Reference Manual", the document that came with 6th Edition Unix in May 1975), the -> operator had a very exclusive meaning that was not synonymous with the * and . combination.

The C language described by CRM was very different from modern C in many respects. In CRM, struct members implemented the global concept of a byte offset, which could be added to any address value with no type restrictions. That is, the names of all struct members had independent global meaning (and therefore had to be unique). For example, you could declare

struct S {
  int a;
  int b;
};

and the name a would stand for offset 0, while the name b would stand for offset 2 (assuming an int of size 2 and no padding). The language required all members of all structs in a translation unit either to have unique names or to stand for the same offset value. E.g. in the same translation unit you could additionally declare

struct X {
  int a;
  int x;
};

and that would be OK, since the name a would consistently stand for offset 0. But this additional declaration

struct Y {
  int b;
  int a;
};

would be formally invalid, since it attempts to "redefine" a as offset 2 and b as offset 0.
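To make the offset rule concrete, here is a small sketch in modern C that measures the same three structs with offsetof. The helper function names are mine, purely for illustration, and on a current compiler int is usually 4 bytes, so b lands at offset sizeof(int) rather than the 2 of the original machines.

```c
#include <assert.h>
#include <stddef.h>

struct S { int a; int b; };
struct X { int a; int x; };
struct Y { int b; int a; };

/* Hypothetical helper: 1 if `a` has the same offset in S and X
   (the case CRM allowed). */
int a_agrees_in_S_and_X(void)
{
    return offsetof(struct S, a) == offsetof(struct X, a);
}

/* Hypothetical helper: 1 if `a` has the same offset in S and Y
   (it does not, which is why CRM rejected Y). */
int a_agrees_in_S_and_Y(void)
{
    return offsetof(struct S, a) == offsetof(struct Y, a);
}

/* The offset that the name `b` stands for in S. */
size_t offset_of_b(void)
{
    assert(offsetof(struct S, a) == 0); /* the first member always sits at offset 0 */
    return offsetof(struct S, b);
}
```

Modern C accepts all three declarations in one translation unit because member names are now scoped to their struct. The comparison only shows which pairs would have satisfied the old "same name, same offset" requirement.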

And this is where the -> operator comes in. Since every struct member name had its own self-sufficient global meaning, the language supported expressions like these

int i = 5;
i->b = 42;  /* Write 42 into `int` at address 7 */
100->a = 0; /* Write 0 into `int` at address 100 */

The first assignment was interpreted by the compiler as "take address 5, add offset 2 to it, and assign 42 to the int value at the resulting address". That is, it would assign 42 to the int at address 7. Note that this use of -> did not care about the type of the expression on the left-hand side. The left-hand side was interpreted as an rvalue numerical address (be it a pointer or an integer).
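Modern C will not compile i->b for an integer i, but the "address plus member offset" arithmetic can be emulated. The following is only a sketch under my own assumptions: the memory array is a hypothetical stand-in for the machine's address space, and crm_store_int and crm_load_int are made-up names, not anything from CRM.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct S { int a; int b; };

/* Hypothetical stand-in for raw memory; an "address" is an index into it. */
static unsigned char memory[256];

/* Emulates CRM-style `addr->member = value`:
   store an int at (addr + member_offset), ignoring the type of addr. */
void crm_store_int(size_t addr, size_t member_offset, int value)
{
    assert(addr + member_offset + sizeof value <= sizeof memory);
    memcpy(memory + addr + member_offset, &value, sizeof value);
}

/* Emulates reading `addr->member` as an int from the same location. */
int crm_load_int(size_t addr, size_t member_offset)
{
    int value;
    assert(addr + member_offset + sizeof value <= sizeof memory);
    memcpy(&value, memory + addr + member_offset, sizeof value);
    return value;
}
```

Calling crm_store_int(5, offsetof(struct S, b), 42) corresponds to i->b = 42 with i equal to 5. With a 4-byte int the value lands at index 9 rather than 7, because the offset of b is sizeof(int) on the platform you compile for.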

This sort of trickery was not possible with the * and . combination. You could not do

(*i).b = 42;

since *i is already an invalid expression. The * operator, being separate from ., imposes stricter type requirements on its operand. To work around this limitation, CRM introduced the -> operator, which is independent of the type of its left-hand operand.

As Keith noted in the comments, this difference between -> and the * and . combination is what CRM refers to as "relaxation of the requirement" in 7.1.8: "Except for the relaxation of the requirement that E1 be of pointer type, the expression E1->MOS is exactly equivalent to (*E1).MOS".

Later, in K&R C, many features originally described in CRM were significantly reworked. The idea of "struct member as global offset identifier" was completely removed, and the functionality of the -> operator became fully identical to that of the * and . combination.
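For comparison, here is a short sketch of the modern equivalence, where p->m and (*p).m name the same object. The function name write_both_ways is hypothetical and exists only for this illustration.

```c
#include <assert.h>

struct S { int a; int b; };

/* Writes one member with each spelling and returns the sum of both members. */
int write_both_ways(struct S *p)
{
    p->b = 42;   /* arrow form */
    (*p).a = 1;  /* explicit dereference followed by dot */

    /* Both spellings designate the same member object. */
    assert(&p->b == &(*p).b);

    return p->a + (*p).b;
}
```

Given struct S s = {0, 0}, calling write_both_ways(&s) leaves s.a equal to 1 and s.b equal to 42.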

Why can't . dereference the pointer automatically?

Again, in the CRM version of the language, the left operand of the . operator was required to be an lvalue. That was the only requirement imposed on that operand (and that is what made it different from ->, as explained above). Note that CRM did not require the left operand of . to have a struct type. It just required it to be an lvalue, any lvalue. This means that in the CRM version of C you could write code like this

struct S { int a, b; };
struct T { float x, y, z; };

struct T c;
c.b = 55;

In this case the compiler would write 55 into an int value positioned at byte offset 2 in the contiguous memory block known as c, even though type struct T has no field named b. The compiler would not care about the actual type of c at all. All it cared about was that c was an lvalue: some sort of writable memory block.
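A modern compiler rejects c.b for a struct T, but the effect described above can be approximated by hand. This is a sketch under my own assumptions: crm_dot_store_b and crm_dot_load_b are hypothetical names, and memcpy is used so that the type-blind write stays well defined in today's C.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct S { int a, b; };
struct T { float x, y, z; };

/* Emulates CRM `c.b = value` on a struct T object:
   store an int at the offset that `b` has in struct S. */
void crm_dot_store_b(struct T *c, int value)
{
    assert(offsetof(struct S, b) + sizeof value <= sizeof *c);
    memcpy((unsigned char *)c + offsetof(struct S, b), &value, sizeof value);
}

/* Reads the int back from the same byte offset inside the struct T object. */
int crm_dot_load_b(const struct T *c)
{
    int value;
    assert(offsetof(struct S, b) + sizeof value <= sizeof *c);
    memcpy(&value, (const unsigned char *)c + offsetof(struct S, b), sizeof value);
    return value;
}
```

On a platform where int and float are both 4 bytes, the stored value overlays the bytes of c.y, which is exactly the kind of silent overwrite the CRM rules permitted.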

Now note that if you did this

struct S *s;
...
s.b = 42;

the code would be considered valid (since s is also an lvalue), and the compiler would simply attempt to write data into the pointer s itself, at byte offset 2. Needless to say, things like this could easily result in a memory overrun, but the language did not concern itself with such matters.

That is, in that version of the language your proposed idea of overloading operator . for pointer types would not work: operator . already had a very specific meaning when used with pointers (with lvalue pointers, or with any lvalues at all). It was very weird functionality, no doubt. But it was there at the time.

Of course, this weird functionality is not a very strong reason against introducing an overloaded . operator for pointers (as you suggested) in the reworked version of C (K&R C). But it wasn't done. Maybe at that time there was some legacy code written in the CRM version of C that had to be supported.

(The URL for the 1975 C Reference Manual may not be stable. Another copy, possibly with some subtle differences, is here.)
