Memory usage difference between generic and non-generic collections in .NET

Question

I have been reading about collections in .NET lately. As is well known, generic collections have some advantages over non-generic ones: they are type-safe, and there is no casting and no boxing/unboxing. That's why generic collections have a performance advantage.

If we consider that non-generic collections store every member as an object, then we might expect generics to have a memory advantage as well. However, I couldn't find any information about the difference in memory usage.

Can anyone clarify this point?

Recommended answer

If we consider that non-generic collections store every member as an object, then we might expect generics to have a memory advantage as well. However, I couldn't find any information about the difference in memory usage. Can anyone clarify this point?

Sure. Let's consider an ArrayList that contains ints vs a List<int>. Let's suppose there are 1000 ints in each list.

In both, the collection type is a thin wrapper around an array -- hence the name ArrayList. In the case of ArrayList, there's an underlying object[] that contains at least 1000 boxed ints. In the case of List<int>, there's an underlying int[] that contains at least 1000 ints.

Why did I say "at least"? Because both use a double-when-full strategy. If you set the capacity of a collection when you create it then it allocates enough space for that many things. If you don't, then the collection has to guess, and if it guesses wrong and you need more capacity, then it doubles its capacity. So, best case, our collection arrays are exactly the right size. Worst case, they are possibly twice as big as they need to be; there could be room for 2000 objects or 2000 ints in the arrays.
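The growth behaviour described above can be observed through the `Capacity` property. The following is a minimal sketch, not a statement about any particular runtime version: the class and method names are made up for illustration, and the exact capacity reached when no initial capacity is given depends on the implementation, so the code only assumes it is at least the number of items added.

```csharp
using System;
using System.Collections.Generic;

public static class CapacityDemo
{
    // Adds `count` ints one at a time and returns the capacity of the backing array.
    public static int CapacityAfterAdds(int count)
    {
        var list = new List<int>();      // no capacity given: the list grows as needed
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        return list.Capacity;
    }

    // Reserves room for `count` ints up front, so no regrowth is needed.
    public static int CapacityWhenPreset(int count)
    {
        var list = new List<int>(count);
        for (int i = 0; i < count; i++)
        {
            list.Add(i);
        }
        return list.Capacity;
    }

    public static void Main()
    {
        Console.WriteLine(CapacityAfterAdds(1000));   // at least 1000, possibly more
        Console.WriteLine(CapacityWhenPreset(1000));  // 1000
    }
}
```

If the final size is known in advance, passing it to the constructor avoids both the spare room and the intermediate array copies.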

But let's suppose for simplicity that we're lucky and there are about 1000 in each.

To start with, what's the memory burden of just the array? An object[1000] takes up 4000 bytes on a 32 bit system and 8000 bytes on a 64 bit system, just for the references, which are pointer sized. An int[1000] takes up 4000 bytes regardless. (There are also a few extra bytes taken up by array bookkeeping, but these costs are small compared to the marginal costs.)

So already we see that the non-generic solution possibly consumes twice as much memory just for the array. What about the contents of the array?

Well, the thing about value types is they are stored right there in their own variable. There is no additional space beyond those 4000 bytes used to store the 1000 integers; they get packed right into the array. So the additional cost is zero for the generic case.

For the object[] case, each member of the array is a reference, and that reference refers to an object; in this case, a boxed integer. What's the size of a boxed integer?

An unboxed value type doesn't need to store any information about its type, because its type is determined by the type of the storage it's in, and that's known to the runtime. A boxed value type needs to store the type of the thing in the box somewhere, and that takes space. It turns out that the bookkeeping overhead for an object in 32 bit .NET is 8 bytes, and 16 on 64 bit systems. That's just the overhead; we of course need 4 bytes for the int. But wait, it gets worse: on 64 bit systems, the box must be aligned to an 8 byte boundary, so we need another 4 bytes of padding on 64 bit systems.
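A small sketch may help make the box concrete. It shows two things the paragraph above relies on: each boxing operation produces its own heap object, and that object knows its own type. The class and method names are illustrative only.

```csharp
using System;

public static class BoxingDemo
{
    // Returns true if two boxes made from the same int are distinct heap objects.
    public static bool BoxesAreDistinct(int value)
    {
        object first = value;    // boxing: allocates an object holding a copy of value
        object second = value;   // boxing again: a second, separate allocation
        return !ReferenceEquals(first, second);
    }

    // The box records its own type; that is part of what the per-object overhead pays for.
    public static Type TypeStoredInBox(int value)
    {
        object boxed = value;
        return boxed.GetType();
    }

    public static void Main()
    {
        Console.WriteLine(BoxesAreDistinct(42));   // True
        Console.WriteLine(TypeStoredInBox(42));    // System.Int32
    }
}
```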

Add it all up: Our int[] takes about 4KB on both 64 and 32 bit systems. Our object[] containing 1000 ints takes about 16KB on 32 bit systems (4 bytes of reference plus a 12 byte box per element), and about 32KB on 64 bit systems (8 bytes of reference plus a 24 byte box per element). So the non-generic case uses either 4 or 8 times as much memory as the int[].
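The arithmetic can be written out as a rough estimate. This sketch only replays the figures quoted in this answer (8 or 16 bytes of object overhead, 4 bytes of payload, 8 byte alignment on 64 bit); it does not measure anything, it ignores array bookkeeping, and the 4 byte alignment used for the 32 bit case is an assumption. All names are hypothetical.

```csharp
using System;

public static class MemoryEstimate
{
    // Rounds size up to the next multiple of alignment.
    static long AlignUp(long size, long alignment) =>
        (size + alignment - 1) / alignment * alignment;

    // int[]: 4 bytes per element.
    public static long IntArrayBytes(int count) => 4L * count;

    // object[] of boxed ints: one reference plus one box per element.
    // pointerSize is 4 (32 bit) or 8 (64 bit).
    public static long BoxedIntArrayBytes(int count, int pointerSize)
    {
        long header = pointerSize == 8 ? 16 : 8;        // overhead figures quoted in the answer
        long box = AlignUp(header + 4, pointerSize);    // assumed alignment: pointer size
        return (pointerSize + box) * count;
    }

    public static void Main()
    {
        Console.WriteLine(IntArrayBytes(1000));           // 4000
        Console.WriteLine(BoxedIntArrayBytes(1000, 4));   // 16000
        Console.WriteLine(BoxedIntArrayBytes(1000, 8));   // 32000
    }
}
```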

But wait, it gets even worse. That's just size. What about access time?

To access an integer from an array of integers, the runtime must:

  • verify that the array is valid
  • verify that the index is valid
  • fetch the value from the variable at the given index

To access an integer from an array of boxed integers, the runtime must:

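  • verify that the array is valid
  • verify that the index is valid
  • fetch the reference from the variable at the given index
  • verify that the reference is not null
  • verify that the reference is a boxed integer
  • extract the integer from the box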

That's a lot more steps, so it takes a lot longer.
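In source code the two access paths look almost identical; the extra steps hide behind the cast. A minimal sketch, with illustrative names, summing the same 1000 values from a List<int> and from an ArrayList:

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class AccessDemo
{
    public static long SumGeneric(List<int> list)
    {
        long sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += list[i];         // the int is read straight out of the int[]
        }
        return sum;
    }

    public static long SumNonGeneric(ArrayList list)
    {
        long sum = 0;
        for (int i = 0; i < list.Count; i++)
        {
            sum += (int)list[i];    // fetch reference, null check, type check, then unbox
        }
        return sum;
    }

    public static void Main()
    {
        var generic = new List<int>();
        var nonGeneric = new ArrayList();
        for (int i = 0; i < 1000; i++)
        {
            generic.Add(i);
            nonGeneric.Add(i);      // boxes i on the way in
        }
        Console.WriteLine(SumGeneric(generic));        // 499500
        Console.WriteLine(SumNonGeneric(nonGeneric));  // 499500
    }
}
```

The cast in `SumNonGeneric` throws if the element is null or is not a boxed int, which is exactly the checking the list above describes.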

But wait, it gets worse.

Modern processors use caches on the chip itself to avoid going back to main memory. An array of 1000 plain integers is highly likely to end up in the cache so that accesses to the first, second, third, etc, members of the array in quick succession are all pulled from the same cache line; this is insanely fast. But boxed integers can be all over the heap, which increases cache misses, which greatly slows down access even further.

Hopefully that sufficiently clarifies your understanding of the boxing penalty.

What about non-boxed types? Is there a significant difference between an array list of strings, and a List<string>?

Here the penalty is much, much smaller, since an object[] and a string[] have similar performance characteristics and memory layouts. The only additional penalty in this case is (1) not catching your bugs until runtime, (2) making the code harder to read and edit, and (3) the slight penalty of a run-time type check.
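Points (1) and (3) can be illustrated with a short sketch. The names are hypothetical; the idea is only that an ArrayList accepts a non-string without complaint and the mistake surfaces at the cast, whereas the same mistake with a List<string> is rejected by the compiler.

```csharp
using System;
using System.Collections;
using System.Collections.Generic;

public static class StringListDemo
{
    // Returns true if reading every element as a string fails at run time.
    public static bool FailsAtRuntime(ArrayList list)
    {
        try
        {
            foreach (object item in list)
            {
                string s = (string)item;   // run-time type check on every element
                _ = s.Length;
            }
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static void Main()
    {
        var untyped = new ArrayList { "a", "b" };
        untyped.Add(42);                              // compiles; the bug is found only later
        Console.WriteLine(FailsAtRuntime(untyped));   // True

        var typed = new List<string> { "a", "b" };
        // typed.Add(42);                             // would not compile
        Console.WriteLine(typed.Count);               // 2
    }
}
```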
