What are the barriers to understanding pointers and what can be done to overcome them?

Problem description

Why are pointers such a leading source of confusion for many new, and even old, college-level students in C or C++? Are there any tools or thought processes that helped you understand how pointers work at the variable, function, and beyond level?

What are some good practices that can bring somebody to the level of "Ah-hah, I got it" without getting them bogged down in the overall concept? Basically, drill-like scenarios.

Solution

Pointers are a concept that many people find confusing at first, in particular when it comes to copying pointer values around while still referencing the same memory block.

I've found that the best analogy is to consider the pointer as a piece of paper with a house address on it, and the memory block it references as the actual house. All sorts of operations can thus be easily explained.

I've added some Delphi code down below, and some comments where appropriate. I chose Delphi since my other main programming language, C#, does not exhibit things like memory leaks in the same way.

If you only wish to learn the high-level concept of pointers, then you should ignore the parts labelled "Memory layout" in the explanation below. They are intended to give examples of what memory could look like after operations, but they are more low-level in nature. However, in order to accurately explain how buffer overruns really work, it was important that I added these diagrams.

Disclaimer: For all intents and purposes, this explanation and the example memory layouts are vastly simplified. There's more overhead and a lot more details you would need to know if you need to deal with memory on a low-level basis. However, for the intents of explaining memory and pointers, it is accurate enough.


Let's assume the THouse class used below looks like this:

type
    THouse = class
    private
        FName : array[0..9] of Char;
    public
        constructor Create(name: PChar);
    end;

When you initialize the house object, the name given to the constructor is copied into the private field FName. There is a reason it is defined as a fixed-size array.

In memory, there will be some overhead associated with the house allocation. I'll illustrate this below like this:

---[ttttNNNNNNNNNN]---
     ^   ^
     |   |
     |   +- the FName array
     |
     +- overhead

The "tttt" area is overhead. There will typically be more of this for various types of runtimes and languages, like 8 or 12 bytes. It is imperative that whatever values are stored in this area never get changed by anything other than the memory allocator or the core system routines, or you risk crashing the program.
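As a rough illustration of that overhead, the sketch below compares the size of the FName array with the size of a whole instance. THouseName and OverheadBytes are names invented for this example, and the figure it reports covers only the object's own header and padding; any bookkeeping the memory allocator adds around the block is not visible this way.

```pascal
program OverheadDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
    // Hypothetical named type for the FName array, so SizeOf can be applied to it.
    THouseName = array[0..9] of Char;

    THouse = class
    private
        FName: THouseName;
    end;

// Bytes in one THouse instance that are not part of the FName array.
function OverheadBytes: Integer;
begin
    Result := THouse.InstanceSize - SizeOf(THouseName);
end;

begin
    WriteLn('FName bytes:    ', SizeOf(THouseName));
    WriteLn('Instance bytes: ', THouse.InstanceSize);
    WriteLn('Overhead bytes: ', OverheadBytes);   // at least the size of one pointer
end.
```

The exact numbers depend on the compiler, the target platform, and the size of Char, which is why the sketch prints them instead of assuming the 4 bytes used in the diagrams.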


Allocate memory

Get a contractor to build your house, and give you the address to the house. In contrast to the real world, the memory allocator cannot be told where to allocate, but will find a suitable spot with enough room and report back the address of the allocated memory.

In other words, the contractor will choose the spot.

THouse.Create('My house');

Memory layout:

---[ttttNNNNNNNNNN]---
    1234My house


Keep a variable with the address

Write the address to your new house down on a piece of paper. This paper will serve as your reference to your house. Without this piece of paper, you're lost, and cannot find the house, unless you're already in it.

var
    h: THouse;
begin
    h := THouse.Create('My house');
    ...

Memory layout:

    h
    v
---[ttttNNNNNNNNNN]---
    1234My house


Copy pointer value

Just write the address on a new piece of paper. You now have two pieces of paper that will get you to the same house, not two separate houses. Any attempts to follow the address from one paper and rearrange the furniture at that house will make it seem that the other house has been modified in the same manner, unless you can explicitly detect that it's actually just one house.

Note: This is usually the concept that I have the most trouble explaining to people: two pointers do not mean two objects or memory blocks.

var
    h1, h2: THouse;
begin
    h1 := THouse.Create('My house');
    h2 := h1; // copies the address, not the house
    ...

    h1
    v
---[ttttNNNNNNNNNN]---
    1234My house
    ^
    h2
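The "two papers, one house" idea can be checked with a small program. This is only a sketch: the THouse below is a hypothetical stand-in with a Furniture field, and FurnitureSeenThroughOriginal is a name invented for the example.

```pascal
program CopyPointerDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
    // Hypothetical stand-in for THouse, with a plain string field for brevity.
    THouse = class
    public
        Furniture: string;
    end;

// Changes the house through a copied reference, then reads it through the original.
function FurnitureSeenThroughOriginal: string;
var
    h1, h2: THouse;
begin
    h1 := THouse.Create;
    try
        h1.Furniture := 'sofa by the window';
        h2 := h1;                            // copies the address, not the house
        h2.Furniture := 'sofa by the door';  // rearranges the one and only house
        Result := h1.Furniture;
    finally
        h1.Free;                             // one house, so exactly one Free
    end;
end;

begin
    WriteLn(FurnitureSeenThroughOriginal);   // prints sofa by the door
end.
```

Because both variables hold the same address, the change made through h2 is what h1 sees, and only one call to Free is needed.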


Freeing the memory

Demolish the house. You can then later on reuse the paper for a new address if you so wish, or clear it to forget the address to the house that no longer exists.

var
    h: THouse;
begin
    h := THouse.Create('My house');
    ...
    h.Free;
    h := nil;

Here I first construct the house, and get hold of its address. Then I do something to the house (use it, the ... code, left as an exercise for the reader), and then I free it. Lastly I clear the address from my variable.

Memory layout:

    h                        <--+
    v                           +- before free
---[ttttNNNNNNNNNN]---          |
    1234My house             <--+

    h (now points nowhere)   <--+
                                +- after free
----------------------          | (note, memory might still
    xx34My house             <--+  contain some data)
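Delphi's SysUtils unit offers FreeAndNil, which combines the two steps of freeing the object and clearing the variable. A minimal sketch, assuming an empty stand-in THouse class and an invented function name:

```pascal
program FreeAndNilDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
    SysUtils;

type
    // Hypothetical minimal stand-in for THouse.
    THouse = class
    end;

// Builds a house, demolishes it, and reports whether the paper was wiped.
function PaperIsBlankAfterDemolition: Boolean;
var
    h: THouse;
begin
    h := THouse.Create;
    FreeAndNil(h);                 // demolishes the house and leaves h = nil
    h.Free;                        // harmless: Free does nothing on a nil reference
    Result := not Assigned(h);     // Assigned(h) is the same test as h <> nil
end;

begin
    WriteLn(PaperIsBlankAfterDemolition);   // prints TRUE
end.
```

Clearing the variable does not protect other copies of the address, but it does make an accidental second Free or a later "is there still a house?" check behave predictably.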


Dangling pointers

You tell your contractor to destroy the house, but you forget to erase the address from your piece of paper. When you later look at the piece of paper, you've forgotten that the house is no longer there, and you go to visit it, with failed results (see also the part about an invalid reference below).

var
    h: THouse;
begin
    h := THouse.Create('My house');
    ...
    h.Free;
    ... // forgot to clear h here
    h.OpenFrontDoor; // will most likely fail

Using h after the call to .Free might work, but that is just pure luck. Most likely it will fail at a customer's site, in the middle of a critical operation.

    h                        <--+
    v                           +- before free
---[ttttNNNNNNNNNN]---          |
    1234My house             <--+

    h                        <--+
    v                           +- after free
----------------------          |
    xx34My house             <--+

As you can see, h still points to the remnants of the data in memory, but since it might not be complete, using it as before might fail.


Memory leak

You lose the piece of paper and cannot find the house. The house is still standing somewhere though, and when you later on want to construct a new house, you cannot reuse that spot.

var
    h: THouse;
begin
    h := THouse.Create('My house');
    h := THouse.Create('My house'); // uh-oh, what happened to our first house?
    ...
    h.Free;
    h := nil;

Here we overwrote the contents of the h variable with the address of a new house, but the old one is still standing... somewhere. After this code, there is no way to reach that house, and it will be left standing. In other words, the allocated memory will stay allocated until the application closes, at which point the operating system will tear it down.

Memory layout after first allocation:

    h
    v
---[ttttNNNNNNNNNN]---
    1234My house

Memory layout after second allocation:

                       h
                       v
---[ttttNNNNNNNNNN]---[ttttNNNNNNNNNN]
    1234My house       5678My house

A more common way to get a memory leak is simply to forget to free something, instead of overwriting the reference as above. In Delphi terms, this will occur with the following method:

procedure OpenTheFrontDoorOfANewHouse;
var
    h: THouse;
begin
    h := THouse.Create('My house');
    h.OpenFrontDoor;
    // uh-oh, no .Free here, where does the address go?
end;

After this method has executed, there's no place in our variables that the address to the house exists, but the house is still out there.

Memory layout:

    h                        <--+
    v                           +- before losing pointer
---[ttttNNNNNNNNNN]---          |
    1234My house             <--+

    h (now points nowhere)   <--+
                                +- after losing pointer
---[ttttNNNNNNNNNN]---          |
    1234My house             <--+

As you can see, the old data is left intact in memory, and will not be reused by the memory allocator. The allocator keeps track of which areas of memory have been used, and will not reuse them unless you free them.
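A common way to avoid this kind of leak in Delphi is to wrap the use of the object in try..finally, so that Free runs no matter how the method exits. The sketch below is one possible version of the method above; the LiveHouses counter and the constructor and destructor bodies are assumptions added only so that the result can be observed.

```pascal
program NoLeakDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

var
    LiveHouses: Integer = 0;   // hypothetical counter of houses still standing

type
    THouse = class
    public
        constructor Create;
        destructor Destroy; override;
        procedure OpenFrontDoor;
    end;

constructor THouse.Create;
begin
    inherited Create;
    Inc(LiveHouses);
end;

destructor THouse.Destroy;
begin
    Dec(LiveHouses);
    inherited Destroy;
end;

procedure THouse.OpenFrontDoor;
begin
    WriteLn('The front door is open.');
end;

procedure OpenTheFrontDoorOfANewHouse;
var
    h: THouse;
begin
    h := THouse.Create;
    try
        h.OpenFrontDoor;
    finally
        h.Free;   // runs even if OpenFrontDoor raises an exception
    end;
end;

begin
    OpenTheFrontDoorOfANewHouse;
    WriteLn('Houses still standing: ', LiveHouses);   // prints Houses still standing: 0
end.
```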


Freeing the memory but keeping a (now invalid) reference

Demolish the house and erase one of the pieces of paper, but you still have another piece of paper with the old address on it. When you go to the address, you won't find a house, but you might find something that resembles the ruins of one.

Perhaps you will even find a house, but it is not the house you were originally given the address to, and thus any attempts to use it as though it belongs to you might fail horribly.

Sometimes you might even find that a neighbouring address has a rather big house set up on it that occupies three addresses (Main Street 1-3), and your address goes to the middle of the house. Any attempts to treat that part of the large 3-address house as a single small house might also fail horribly.

var
    h1, h2: THouse;
begin
    h1 := THouse.Create('My house');
    h2 := h1; // copies the address, not the house
    ...
    h1.Free;
    h1 := nil;
    h2.OpenFrontDoor; // uh-oh, what happened to our house?

Here the house was torn down, through the reference in h1, and while h1 was cleared as well, h2 still has the old, out-of-date, address. Access to the house that is no longer standing might or might not work.

This is a variation of the dangling pointer above. See its memory layout.


Buffer overrun

You move more stuff into the house than you can possibly fit, spilling into the neighbour's house or yard. When the owner of that neighbouring house later comes home, he'll find all sorts of things he'll consider his own.

This is the reason I chose a fixed-size array. To set the stage, assume that the second house we allocate will, for some reason, be placed before the first one in memory. In other words, the second house will have a lower address than the first one. Also, they're allocated right next to each other.

Thus, this code:

var
    h1, h2: THouse;
begin
    h1 := THouse.Create('My house');
    h2 := THouse.Create('My other house somewhere');
    //                   ^-----------------------^
    //                    longer than 10 characters
    //                   0123456789 <-- 10 characters

Memory layout after first allocation:

                        h1
                        v
-----------------------[ttttNNNNNNNNNN]
                        5678My house

Memory layout after second allocation:

    h2                  h1
    v                   v
---[ttttNNNNNNNNNN]----[ttttNNNNNNNNNN]
    1234My other house somewhereouse
                        ^---+--^
                            |
                            +- overwritten

The part that will most often cause a crash is when you overwrite important parts of the data you stored that really should not be randomly changed. For instance, it might not be a problem, in terms of crashing the program, that parts of the name of the h1 house were changed. Overwriting the overhead of the object, however, will most likely crash the program when you try to use the broken object, as will overwriting links to other objects that are stored in the object.
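The answer does not show the body of the constructor, so the following is only a guess at one safe way to write it: copy with StrLCopy from SysUtils, which stops after a given number of characters and then adds the terminator. DisplayName and NameAfterStoring are helper names invented for this sketch.

```pascal
program BoundedNameDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

uses
    SysUtils;

type
    THouse = class
    private
        FName: array[0..9] of Char;
    public
        constructor Create(name: PChar);
        function DisplayName: string;
    end;

constructor THouse.Create(name: PChar);
begin
    inherited Create;
    // Copy at most 9 characters; StrLCopy then writes the terminating #0,
    // so nothing is ever written past FName[9].
    StrLCopy(FName, name, High(FName));
end;

function THouse.DisplayName: string;
begin
    Result := PChar(@FName[0]);   // read back up to the terminating #0
end;

// Hypothetical helper: stores a name in a house and returns what was kept.
function NameAfterStoring(source: PChar): string;
var
    h: THouse;
begin
    h := THouse.Create(source);
    try
        Result := h.DisplayName;
    finally
        h.Free;
    end;
end;

begin
    WriteLn(Length(NameAfterStoring('My other house somewhere')));   // prints 9
    WriteLn(NameAfterStoring('My house'));                           // prints My house
end.
```

The long name is silently truncated here. Whether truncating, raising an exception, or switching to a dynamically sized string is the right response depends on the program; the point is only that the neighbouring house is left alone.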


Linked lists

When you follow an address on a piece of paper, you get to a house, and at that house there is another piece of paper with a new address on it, for the next house in the chain, and so on.

var
    h1, h2: THouse;
begin
    h1 := THouse.Create('Home');
    h2 := THouse.Create('Cabin');
    h1.NextHouse := h2;

Here we create a link from our home house to our cabin. We can follow the chain until a house has no NextHouse reference, which means it's the last one. To visit all our houses, we could use the following code:

var
    h1, h2: THouse;
    h: THouse;
begin
    h1 := THouse.Create('Home');
    h2 := THouse.Create('Cabin');
    h1.NextHouse := h2;
    ...
    h := h1;
    while h <> nil do
    begin
        h.LockAllDoors;
        h.CloseAllWindows;
        h := h.NextHouse;
    end;

Memory layout (added NextHouse as a link in the object, noted with the four LLLL's in the below diagram):

    h1                      h2
    v                       v
---[ttttNNNNNNNNNNLLLL]----[ttttNNNNNNNNNNLLLL]
    1234Home       +        5678Cabin      +
                   |        ^              |
                   +--------+              * (no link)
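For completeness, here is a self-contained sketch of such a chain, including the clean-up that the snippets above leave out. The THouse declaration with a NextHouse field, CountHouses, and FreeChain are all assumptions made for this example.

```pascal
program LinkedHousesDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
    THouse = class
    public
        Name: string;
        NextHouse: THouse;      // nil means "last house in the chain"
        constructor Create(const AName: string);
    end;

constructor THouse.Create(const AName: string);
begin
    inherited Create;
    Name := AName;
    NextHouse := nil;
end;

// Follows the chain from first and counts the houses visited.
function CountHouses(first: THouse): Integer;
var
    h: THouse;
begin
    Result := 0;
    h := first;
    while h <> nil do
    begin
        Inc(Result);
        h := h.NextHouse;
    end;
end;

// Demolishes every house in the chain.
procedure FreeChain(first: THouse);
var
    h, next: THouse;
begin
    h := first;
    while h <> nil do
    begin
        next := h.NextHouse;    // read the link while the house still exists
        h.Free;
        h := next;
    end;
end;

var
    h1, h2: THouse;
begin
    h1 := THouse.Create('Home');
    h2 := THouse.Create('Cabin');
    h1.NextHouse := h2;
    WriteLn(CountHouses(h1));   // prints 2
    FreeChain(h1);              // frees both Home and Cabin
end.
```

Note the order inside FreeChain: the next address is copied off the paper before the house holding that paper is demolished, which avoids the dangling-pointer problem described earlier.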


In basic terms, what is a memory address?

A memory address is in basic terms just a number. If you think of memory as a big array of bytes, the very first byte has the address 0, the next one the address 1 and so on upwards. This is simplified, but good enough.

So this memory layout:

    h1                 h2
    v                  v
---[ttttNNNNNNNNNN]---[ttttNNNNNNNNNN]
    1234My house       5678My house

Might have these two addresses (the leftmost - is address 0):

  • h1 = 4
  • h2 = 23

Which means that our linked list above might actually look like this:

    h1 (=4)                 h2 (=28)
    v                       v
---[ttttNNNNNNNNNNLLLL]----[ttttNNNNNNNNNNLLLL]
    1234Home      0028      5678Cabin     0000
                   |        ^              |
                   +--------+              * (no link)

It is typical to store an address that "points nowhere" as a zero-address.
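To see the number behind a reference, you can cast it to an unsigned integer of pointer size. The sketch below assumes an empty stand-in THouse class and an invented helper called HouseAddress; the actual values printed for h1 and h2 are chosen by the allocator and will differ from run to run and from the small numbers in the diagrams.

```pascal
program AddressAsNumberDemo;
{$IFDEF FPC}{$MODE DELPHI}{$ELSE}{$APPTYPE CONSOLE}{$ENDIF}

type
    // Hypothetical minimal stand-in for THouse.
    THouse = class
    end;

// Returns the address held in an object reference as a plain unsigned number.
function HouseAddress(h: THouse): NativeUInt;
begin
    Result := NativeUInt(Pointer(h));
end;

var
    h1, h2, copyOfH1: THouse;
begin
    h1 := THouse.Create;
    h2 := THouse.Create;
    copyOfH1 := h1;
    try
        WriteLn('h1 = ', HouseAddress(h1));                   // some number picked by the allocator
        WriteLn('h2 = ', HouseAddress(h2));                   // a different number
        WriteLn(HouseAddress(copyOfH1) = HouseAddress(h1));   // prints TRUE
        WriteLn(HouseAddress(nil));                           // prints 0
    finally
        h2.Free;
        h1.Free;
    end;
end.
```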


In basic terms, what is a pointer?

A pointer is just a variable holding a memory address. You can typically ask the programming language to give you its number, but most programming languages and runtimes try to hide the fact that there is a number beneath, because the number itself does not really hold any meaning to you. It is best to think of a pointer as a black box, i.e. you don't really know or care how it is actually implemented, as long as it works.
