How do JavaScript Arrays resize internally?

Question

I've been trying to implement a collection type of class (similar to List found in C#) in JavaScript that has some custom functionalities. I also wanted it to be somewhat optimized (I've read some articles on how to properly use JavaScript Arrays).

I thought to myself "if we don't define an initial size to an Array and we keep adding objects to it, internally it will have to allocate a new size for each insertion, that must be slow. I can avoid this by allocating a new size myself (changing the array length), somewhat similar to how it is done in C#, doubling in size whenever the max capacity is reached (I know it's not this trivial but it's a start)".

I tried to implement this idea and found out that it is way slower (about 10 times slower):

// This simplified approach of my implementation is faster...
var array = [];
var counter = 0;
function addItem(newItem) {
    array[++counter] = newItem;
}

// ...than this version, which resizes the array when a limit is reached
var array = [];
array.length = INITIAL_SIZE;
/*
 Alternatively
 var array = new Array(INITIAL_SIZE);
*/
var counter = 0;
function addItem(newItem) {
    if( CheckCapacity(counter + 1) ) { // Function that checks if the maximum size is reached and if it is, change the array.length to the new size
        array[++counter] = newItem;
    }
}
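For anyone who wants to reproduce the comparison, here is a self-contained sketch of the two variants. The question does not show INITIAL_SIZE or CheckCapacity, so the value 16, the doubling logic in `checkCapacity`, and the names `makeSimpleList`, `makeResizingList` and `timeIt` are all assumptions made for illustration. It also uses `counter++` so that the first item lands at index 0. Timings depend heavily on the engine and its version, so none are quoted here.

```javascript
// Variant 1: just assign to the next index and let the engine grow the array.
function makeSimpleList() {
  const items = [];
  let counter = 0;
  return {
    add(item) { items[counter++] = item; },
    size() { return counter; },
    items,
  };
}

// Variant 2: manual "capacity" management by assigning to array.length.
const INITIAL_SIZE = 16; // assumed value, not given in the question

function makeResizingList() {
  const items = [];
  items.length = INITIAL_SIZE; // creates INITIAL_SIZE holes
  let counter = 0;

  // Hypothetical stand-in for the question's CheckCapacity:
  // doubles array.length when the requested size no longer fits.
  function checkCapacity(needed) {
    if (needed > items.length) {
      items.length = items.length * 2;
    }
    return true;
  }

  return {
    add(item) {
      if (checkCapacity(counter + 1)) items[counter++] = item;
    },
    size() { return counter; },
    items,
  };
}

// Crude timing helper; a real benchmark would need warm-up and repetition.
function timeIt(label, list, n) {
  const start = Date.now();
  for (let i = 0; i < n; i++) list.add(i);
  console.log(label, Date.now() - start, "ms");
}

timeIt("engine-managed growth:", makeSimpleList(), 1e6);
timeIt("manual length doubling:", makeResizingList(), 1e6);
```

Note that in the second variant `items.length` is always larger than the number of stored items, so the tail of the array consists of holes at all times.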

Before testing this, I thought to myself, "since the array already has its new size when I call CheckCapacity(counter + 1), internally it (the JavaScript Array) won't have to do as many operations as in the first function, because I make sure that there is more space available than necessary", i.e., the array[++counter] = newItem line in the second function should be faster than the same line in the first function.

I've even used different arrays which contained pre-calculated sizes for the one holding the items; it still was slower.

So back to my question, how is the implementation of a JavaScript Array allocating the necessary size? Am I correct to assume that not much can be done to speed this process up? To me it made sense that one of the drawbacks of having an object (the JavaScript Array) that dynamically allocates more memory each time a new item is added would be the loss of speed (unless it has pretty good algorithms implemented, but I don't know, hence my question).

Recommended answer

In JavaScript, an Array is an abstraction. How it is implemented (and when allocation and resizing are performed) is left up to the JavaScript engine; the ECMAScript specification does not dictate how this is done. So there is basically no precise way to know.

In practice, JavaScript engines are very clever about how they allocate memory, and they make sure not to allocate too much. In my opinion, they are far more sophisticated than C#'s List, because JavaScript engines can dynamically change the underlying data structure depending on the situation. The algorithms vary, but most will consider whether there are any "holes" in your array:

var array = [];
array[0] = "foo"          // Is a resizable array
array[1] = "bar"          // Is a resizable array
array[2] = "baz"          // Is a resizable array
array[1000000] = "hello"; // Is now a hash table
console.log(array[1000000]) // "hello"

If you use arrays normally and use contiguous keys starting at zero, then there are no "holes" and most JavaScript engines will represent the JavaScript array with a resizable array data structure. Now consider the fourth assignment: I've created a so-called "hole" roughly a million slots wide (the hole spans slots 3-999999). It turns out that JavaScript engines are clever enough not to allocate ~1 million slots in memory for this massive hole. When an engine detects a hole like this, it can switch to representing the JavaScript array with a dictionary / hash-table-like data structure to save space (the exact structure is an engine implementation detail). It won't reserve space for the hole, just four mappings: (0, "foo"), (1, "bar"), (2, "baz"), (1000000, "hello").
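The hole itself can be observed from script, even though the engine's internal representation cannot. The following sketch repeats the four assignments and inspects the result using only standard language features; it shows that the array reports a length of 1000001 while holding just four own index properties.

```javascript
const array = [];
array[0] = "foo";
array[1] = "bar";
array[2] = "baz";
array[1000000] = "hello";

console.log(array.length);              // 1000001
console.log(Object.keys(array).length); // 4
console.log(3 in array);                // false (index 3 is a hole)
console.log(1000000 in array);          // true
console.log(array[3]);                  // undefined (reading a hole)

// forEach only visits indices that actually exist, so holes are skipped.
let visited = 0;
array.forEach(() => { visited++; });
console.log(visited);                   // 4
```

Whether the engine really switches to a hash table at this point is not something this code can show; it only demonstrates that, at the language level, the hole contains no elements.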

Unfortunately, accessing the Array is now slower for the engine, because each access has to go through a hash lookup instead of indexing straight into contiguous storage. When there are no holes, the engine uses a resizable array and access is quicker; when there is a hole, the Array's performance is worse. The common terminology is to call an Array a dense array when it has no holes (it uses a resizable array = better performance) and a sparse array when it has one or more holes (it uses a hash table = slower performance). For best performance in general, try to use dense arrays.
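At the language level, the dense/sparse distinction can be checked with a small helper. `isDense` below is a hypothetical name, not a built-in; it reports whether every index below `length` is an own property. This is only a sketch of the terminology: it says nothing definite about which internal representation a given engine has chosen.

```javascript
// Hypothetical helper: true when every index below length is an own property.
function isDense(arr) {
  for (let i = 0; i < arr.length; i++) {
    if (!Object.prototype.hasOwnProperty.call(arr, i)) return false;
  }
  return true;
}

const dense = ["a", "b", "c"];
const sparse = ["a", , "c"];                 // the elision leaves a hole at index 1
const withUndefined = ["a", undefined, "c"]; // an explicit undefined is not a hole

console.log(isDense(dense));         // true
console.log(isDense(sparse));        // false
console.log(isDense(withUndefined)); // true
```

The last case is worth noting: an element that was explicitly set to `undefined` still occupies its slot, so the array stays dense.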

Now to finish off, let me tell you that the following is a bad idea:

var array = new Array(1000000);
array[0] = "foo";               // Is a hash table

The array above has a hole of size ~1 million (it's like this: ["foo", undefined, undefined, ... undefined]) and is therefore using a hash table as the underlying data structure. So implementing the resizing yourself is a bad idea: it creates a hole and makes performance worse rather than better. You're only confusing the JavaScript engine.

This is what your code was doing: your array always had a hole in it and was therefore using a hash table as the underlying data structure, which gives slower performance than an array without any holes (i.e. the first version of your code).
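That both `new Array(size)` and assigning to `array.length` produce holes rather than stored elements can be verified directly. The sketch below uses a size of 5 to keep the output readable; the variable names are illustrative only.

```javascript
// new Array(n) sets length but creates no elements.
const preallocated = new Array(5);
console.log(preallocated.length);              // 5
console.log(Object.keys(preallocated).length); // 0
console.log(0 in preallocated);                // false

// Assigning to length has the same effect.
const resized = [];
resized.length = 5;
console.log(0 in resized);                     // false

// After one assignment, indices 1-4 are still holes.
preallocated[0] = "foo";
console.log(Object.keys(preallocated).length); // 1
console.log(1 in preallocated);                // false

// Growing by push never leaves a hole.
const grown = [];
for (let i = 0; i < 5; i++) grown.push(i);
console.log(Object.keys(grown).length === grown.length); // true
```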

"Am I correct to assume that not much can be done to speed this process up?"

Yes, there is little to be done on the user's side regarding pre-allocation of space. To speed up JavaScript arrays in general, you want to avoid creating sparse arrays (avoid creating holes):

  1. Don't pre-allocate using new Array(size). Instead "grow as you go". The engine will work out the size of the underlying resizable array itself.
  2. Use contiguous integer keys starting at 0. Don't start from a big integer. Don't add keys that are not integers (e.g. don't use strings as keys).
  3. Try not to delete keys in the middle of arrays (don't delete the element at index 5 from an array with indices 0-9 filled in).
  4. Don't convert to and from dense and sparse arrays (i.e. don't repeatedly add and remove holes). There's an overhead for the engine to convert to and from the resizable array vs hash-table representations.
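The advice in the list can be sketched in a few lines. The variable names below are illustrative; the calls are all standard Array methods. The sketch grows an array with `push`, then contrasts `delete` (which leaves a hole) with `splice` (which removes the element and closes the gap), and shows that a string key is stored as an ordinary property rather than as an element.

```javascript
// Grow as you go: contiguous integer keys starting at 0.
const items = [];
for (let i = 0; i < 10; i++) items.push(i * 10);

// delete removes the element but leaves a hole in the middle.
const holed = items.slice();
delete holed[5];
console.log(holed.length);   // 10
console.log(5 in holed);     // false

// splice removes the element and shifts the rest down, keeping the array dense.
const compact = items.slice();
compact.splice(5, 1);
console.log(compact.length); // 9
console.log(5 in compact);   // true
console.log(compact[5]);     // 60

// A string key is an ordinary property, not an element; length is unaffected.
const tagged = items.slice();
tagged.label = "x";
console.log(tagged.length);  // 10
```

`splice` has its own cost, since the elements after the removed one are shifted, so whether it is the better choice depends on how large the array is and how often removals happen.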

"The disadvantage of [JavaScript Arrays over C# Lists is that they] dynamically allocate more memory each time a new item is added"

No, not necessarily. C# Lists and JavaScript Arrays are basically the same when the JavaScript array has no holes. Both are resizable arrays. The difference is that:

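  1. C# Lists give the user more control over the behaviour of the resizable array. In JavaScript you have no control over it; it is internal to the engine.
  2. C# Lists allow the user to preallocate memory for better performance, whereas in JavaScript you should let the engine work out for itself how to preallocate memory in the underlying resizable array.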
