C++ references and pointers at the compiler level

Question

I'm trying to learn how C++ compilers handle references and pointers, in preparation for a compilers class that I'm taking next semester. I'm specifically interested in how compilers handle references in C++.

The standard specifies that a reference is an "alias," but I don't know exactly what that means at the compiler level. I have two theories:

  1. A non-reference variable has an entry in the symbol table. When a reference to that variable is created, the compiler simply creates another lexeme that "points" to the exact same entry in the symbol table (and not to the non-reference variable's location in memory).

  2. When a reference to that variable is created, the compiler creates a pointer to that variable's location in memory. The limitations on references (no null values, etc.) are enforced when the compiler parses and checks the source. In other words, a reference is "syntactic sugar" for a dereferenced pointer.

Both solutions would create an "alias," as far as I can tell. Do compilers use one and not the other? Or is it compiler-dependent?

As an aside, I'm aware that at the machine-language level, both are "pointers" (pretty much everything other than an integer is a "pointer" at the machine level). I'm interested in what the compiler does before the machine code is generated.

Part of the reason I am curious is that PHP uses method #1, and I'm wondering whether C++ compilers work the same way. Java certainly does not use method #1, and its "references" are in fact dereferenced pointers; see this article by Scott Stanchfield.

Recommended answer

I will try to explain how the g++ compiler implements references.

    #include <iostream>

    using namespace std;

    int main()
    {
        int i = 10;
        int *ptrToI = &i;
        int &refToI = i;

        cout << "i = " << i << "\n";
        cout << "&i = " << &i << "\n";

        cout << "ptrToI = " << ptrToI << "\n";
        cout << "*ptrToI = " << *ptrToI << "\n";
        cout << "&ptrToI = " << &ptrToI << "\n";

        cout << "refToNum = " << refToI << "\n";
        //cout << "*refToNum = " << *refToI << "\n";
        cout << "&refToNum = " << &refToI << "\n";

        return 0;
    }

The output of this code looks like this (the addresses will differ from run to run):

    i = 10
    &i = 0xbf9e52f8
    ptrToI = 0xbf9e52f8
    *ptrToI = 10
    &ptrToI = 0xbf9e52f4
    refToNum = 10
    &refToNum = 0xbf9e52f8

Let's look at the disassembly (I used GDB for this; 8, 9, and 10 here are line numbers in the source code).

    8           int i = 10;
    0x08048698 <main()+18>: movl   $0xa,-0x10(%ebp)

Here $0xa is the value 10 (decimal) that we are assigning to i. -0x10(%ebp) denotes the memory location at the contents of the ebp register minus 16 (decimal), which is where i lives on the stack.

    9           int *ptrToI = &i;
    0x0804869f <main()+25>: lea    -0x10(%ebp),%eax
    0x080486a2 <main()+28>: mov    %eax,-0x14(%ebp)

This assigns the address of i to ptrToI. ptrToI is also on the stack, at address -0x14(%ebp), that is, ebp minus 20 (decimal).

    10          int &refToI = i;
    0x080486a5 <main()+31>: lea    -0x10(%ebp),%eax
    0x080486a8 <main()+34>: mov    %eax,-0xc(%ebp)

Now here is the catch! Compare the disassembly of lines 9 and 10 and you will observe that -0x14(%ebp) is replaced by -0xc(%ebp) in line 10. -0xc(%ebp) is the address of refToI. It is allocated on the stack. But you will never be able to get this address from your code, because the language does not expose it.

So, in this build, a reference does occupy memory. Here it is stack memory, since we declared it as a local variable. How much memory does it occupy? As much as a pointer occupies. (The standard leaves it unspecified whether a reference requires storage, so an optimizing compiler is free to eliminate it.)

Now let's see how we access the reference and the pointer. For simplicity I show only part of the assembly.

    16          cout << "*ptrToI = " << *ptrToI << "\n";
    0x08048746 <main()+192>:        mov    -0x14(%ebp),%eax
    0x08048749 <main()+195>:        mov    (%eax),%ebx
    19          cout << "refToNum = " << refToI << "\n";
    0x080487b0 <main()+298>:        mov    -0xc(%ebp),%eax
    0x080487b3 <main()+301>:        mov    (%eax),%ebx

Now compare the two snippets above and you will see a striking similarity. -0xc(%ebp) is the actual address of refToI, which is never accessible to you. In simple terms, if you think of a reference as an ordinary pointer, then accessing a reference is like fetching the value at the address the reference points to. This means the two lines of code below give you the same result:

    cout << "Value of i = " << *ptrToI << "\n";
    cout << "Value of i = " << refToI << "\n";

Now compare these:

    15          cout << "ptrToI = " << ptrToI << "\n";
    0x08048713 <main()+141>:        mov    -0x14(%ebp),%ebx
    21          cout << "&refToNum = " << &refToI << "\n";
    0x080487fb <main()+373>:        mov    -0xc(%ebp),%eax

I guess you can spot what is happening here. If you ask for &refToI, the contents of the location -0xc(%ebp) are returned; -0xc(%ebp) is where refToI resides, and its contents are nothing but the address of i.

One last thing: why is this line commented out?

    //cout << "*refToNum = " << *refToI << "\n";

Because refToI behaves as an int, *refToI applies the dereference operator to an int, which is not permitted and gives a compile-time error.
