为什么我们使用数组而不是其他数据结构? [英] Why do we use arrays instead of other data structures?

查看:26
本文介绍了为什么我们使用数组而不是其他数据结构?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

在我编程时,我还没有看到数组比另一种形式更适合存储信息的实例.我确实认为编程语言中添加的功能"在此基础上有所改进,并由此取代了它们.我现在明白了,他们没有被取代,而是被赋予了新的生命,可以这么说.

As I was programming, I haven't seen an instance where an array is better for storing information than another form thereof. I had indeed figured the added "features" in programming languages had improved upon this and by that replaced them. I see now that they aren't replaced but rather given new life, so to speak.

那么,基本上,使用数组的意义何在?

So, basically, what's the point of using arrays?

这不是为什么我们从计算机的角度使用数组,而是为什么我们从编程的角度使用数组(一个细微的区别).计算机如何处理数组并不是问题的重点.

This is not so much why do we use arrays from a computer standpoint, but rather why would we use arrays from a programming standpoint (a subtle difference). What the computer does with the array was not the point of the question.

推荐答案

是时候回到过去上课了.虽然我们今天在我们花哨的托管语言中很少考虑这些事情,但它们建立在相同的基础上,所以让我们看看在 C 中是如何管理内存的.

Time to go back in time for a lesson. While we don't think about these things much in our fancy managed languages today, they are built on the same foundation, so let's look at how memory is managed in C.

在我深入研究之前,快速解释一下术语指针"的含义.指针只是一个指向"内存中某个位置的变量.它不包含该内存区域的实际值,它包含它的内存地址.将内存块视为邮箱.指针将是该邮箱的地址.

Before I dive in, a quick explanation of what the term "pointer" means. A pointer is simply a variable that "points" to a location in memory. It doesn't contain the actual value at this area of memory, it contains the memory address to it. Think of a block of memory as a mailbox. The pointer would be the address to that mailbox.

在 C 中,数组只是一个带有偏移量的指针,偏移量指定在内存中查看多远.这提供了 O(1) 访问时间.

In C, an array is simply a pointer with an offset, the offset specifies how far in memory to look. This provides O(1) access time.

  MyArray   [5]
     ^       ^
  Pointer  Offset

所有其他数据结构要么以此为基础,要么不使用相邻内存进行存储,从而导致随机访问查找时间很短(尽管不使用顺序内存还有其他好处).

All other data structures either build upon this, or do not use adjacent memory for storage, resulting in poor random access look up time (Though there are other benefits to not using sequential memory).

例如,假设我们有一个包含 6 个数字 (6,4,2,3,1,5) 的数组,在内存中它看起来像这样:

For example, let's say we have an array with 6 numbers (6,4,2,3,1,5) in it, in memory it would look like this:

=====================================
|  6  |  4  |  2  |  3  |  1  |  5  |
=====================================

在数组中,我们知道每个元素在内存中都是相邻的.C 数组(此处称为 MyArray)只是指向第一个元素的指针:

In an array, we know that each element is next to each other in memory. A C array (Called MyArray here) is simply a pointer to the first element:

=====================================
|  6  |  4  |  2  |  3  |  1  |  5  |
=====================================
   ^
MyArray

如果我们想查找MyArray[4],内部会像这样访问它:

If we wanted to look up MyArray[4], internally it would be accessed like this:

   0     1     2     3     4 
=====================================
|  6  |  4  |  2  |  3  |  1  |  5  |
=====================================
                           ^
MyArray + 4 ---------------/
(Pointer + Offset)

因为我们可以通过给指针加上偏移量来直接访问数组中的任何元素,所以我们可以在相同的时间内查找任何元素,而不管数组的大小.这意味着获取 MyArray[1000] 将花费与获取 MyArray[5] 相同的时间.

Because we can directly access any element in the array by adding the offset to the pointer, we can look up any element in the same amount of time, regardless of the size of the array. This means that getting MyArray[1000] would take the same amount of time as getting MyArray[5].

另一种数据结构是链表.这是一个线性指针列表,每个指针指向下一个节点

An alternative data structure is a linked list. This is a linear list of pointers, each pointing to the next node

========    ========    ========    ========    ========
| Data |    | Data |    | Data |    | Data |    | Data |
|      | -> |      | -> |      | -> |      | -> |      | 
|  P1  |    |  P2  |    |  P3  |    |  P4  |    |  P5  |        
========    ========    ========    ========    ========

P(X) stands for Pointer to next node.

请注意,我将每个节点"都放入了自己的块中.这是因为它们不能保证(并且很可能不会)在内存中相邻.

Note that I made each "node" into its own block. This is because they are not guaranteed to be (and most likely won't be) adjacent in memory.

如果我想访问P3,我不能直接访问它,因为我不知道它在内存中的什么地方.我所知道的是根 (P1) 在哪里,所以我必须从 P1 开始,并跟随每个指向所需节点的指针.

If I want to access P3, I can't directly access it, because I don't know where it is in memory. All I know is where the root (P1) is, so instead I have to start at P1, and follow each pointer to the desired node.

这是一个 O(N) 的查找时间(查找成本随着每个元素的添加而增加).与达到 P4 相比,达到 P1000 的成本要高得多.

This is a O(N) look up time (The look up cost increases as each element is added). It is much more expensive to get to P1000 compared to getting to P4.

更高级的数据结构,例如哈希表、堆栈和队列,内部都可能使用一个数组(或多个数组),而链表和二叉树通常使用节点和指针.

Higher level data structures, such as hashtables, stacks and queues, all may use an array (or multiple arrays) internally, while Linked Lists and Binary Trees usually use nodes and pointers.

您可能想知道为什么有人会使用需要线性遍历的数据结构来查找值而不是仅使用数组,但它们有其用途.

You might wonder why anyone would use a data structure that requires linear traversal to look up a value instead of just using an array, but they have their uses.

再次获取我们的数组.这一次,我想找到保存值为 '5' 的数组元素.

Take our array again. This time, I want to find the array element that holds the value '5'.

=====================================
|  6  |  4  |  2  |  3  |  1  |  5  |
=====================================
   ^     ^     ^     ^     ^   FOUND!

在这种情况下,我不知道要向指针添加什么偏移量才能找到它,所以我必须从 0 开始,然后一直向上直到找到它.这意味着我必须执行 6 项检查.

In this situation, I don't know what offset to add to the pointer to find it, so I have to start at 0, and work my way up until I find it. This means I have to perform 6 checks.

因此,在数组中搜索一个值被认为是 O(N).搜索成本随着数组变大而增加.

Because of this, searching for a value in an array is considered O(N). The cost of searching increases as the array gets larger.

还记得上面我说过有时使用非顺序数据结构会有好处吗?搜索数据是这些优势之一,最好的例子之一就是二叉树.

Remember up above where I said that sometimes using a non sequential data structure can have advantages? Searching for data is one of these advantages and one of the best examples is the Binary Tree.

二叉树是一种类似于链表的数据结构,但不是链接到单个节点,而是每个节点可以链接到两个子节点.

A Binary Tree is a data structure similar to a linked list, however instead of linking to a single node, each node can link to two children nodes.

         ==========
         |  Root  |         
         ==========
        /          \ 
  =========       =========
  | Child |       | Child |
  =========       =========
                  /       \
            =========    =========
            | Child |    | Child |
            =========    =========

 Assume that each connector is really a Pointer

当数据插入二叉树时,它使用几个规则来决定放置新节点的位置.基本概念是,如果新值大于父值,则将其插入到左侧,如果较低,则将其插入到右侧.

When data is inserted into a binary tree, it uses several rules to decide where to place the new node. The basic concept is that if the new value is greater than the parents, it inserts it to the left, if it is lower, it inserts it to the right.

这意味着二叉树中的值可能如下所示:

This means that the values in a binary tree could look like this:

         ==========
         |   100  |         
         ==========
        /          \ 
  =========       =========
  |  200  |       |   50  |
  =========       =========
                  /       \
            =========    =========
            |   75  |    |   25  |
            =========    =========

在二叉树中搜索 75 的值时,我们只需要访问 3 个节点( O(log N) ),因为这种结构:

When searching a binary tree for the value of 75, we only need to visit 3 nodes ( O(log N) ) because of this structure:

  • 75 是否小于 100?查看右节点
  • 75 是否大于 50?查看左节点
  • 有 75 个!

即使我们的树中有 5 个节点,我们也不需要查看剩余的两个节点,因为我们知道它们(及其子节点)不可能包含我们正在寻找的值.这给了我们一个搜索时间,在最坏的情况下意味着我们必须访问每个节点,但在最好的情况下我们只需要访问节点的一小部分.

Even though there are 5 nodes in our tree, we did not need to look at the remaining two, because we knew that they (and their children) could not possibly contain the value we were looking for. This gives us a search time that at worst case means we have to visit every node, but in the best case we only have to visit a small portion of the nodes.

这就是数组被击败的地方,尽管访问时间为 O(1),但它们提供了线性 O(N) 搜索时间.

That is where arrays get beat, they provide a linear O(N) search time, despite O(1) access time.

这是对内存中数据结构的令人难以置信的高级概述,跳过了很多细节,但希望它能说明与其他数据结构相比数组的优势和劣势.

This is an incredibly high level overview on data structures in memory, skipping over a lot of details, but hopefully it illustrates an array's strength and weakness compared to other data structures.

这篇关于为什么我们使用数组而不是其他数据结构?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆