Overhead of a .NET array?

Question

I was trying to determine the overhead of the header on a .NET array (in a 32-bit process) using this code:

long bytes1 = GC.GetTotalMemory(false);
object[] array = new object[10000];
for (int i = 0; i < 10000; i++)
    array[i] = new int[1];
long bytes2 = GC.GetTotalMemory(false);
array[0] = null; // ensure no garbage collection before this point

Console.WriteLine(bytes2 - bytes1);
// Calculate array overhead in bytes by subtracting the size of 
// the array elements (40000 for object[10000] and 4 for each 
// array), and dividing by the number of arrays (10001)
Console.WriteLine("Array overhead: {0:0.000}", 
                  ((double)(bytes2 - bytes1) - 40000) / 10001 - 4);
Console.Write("Press any key to continue...");
Console.ReadKey();

The result is

    204800
    Array overhead: 12.478

In a 32-bit process, object[1] should be the same size as int[1], but in fact the overhead jumps by 3.28 bytes to

    237568
    Array overhead: 15.755
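
As a cross-check, the two printed totals can be pushed back through the question's own formula. This is only a sketch: the class and method names are made up for illustration, and the constants 40000, 10001 and 4 are taken straight from the code above.

```csharp
using System;

static class OverheadArithmetic
{
    // Same formula as the question: subtract the 40000 bytes of element
    // slots in object[10000], divide by the 10001 arrays allocated, then
    // subtract the 4 bytes of payload held by each one-element array.
    public static double PerArrayOverhead(long totalBytes)
    {
        return ((double)totalBytes - 40000) / 10001 - 4;
    }

    static void Main()
    {
        double valueType = PerArrayOverhead(204800);     // total from the int[1] run
        double referenceType = PerArrayOverhead(237568); // total from the object[1] run

        Console.WriteLine("int[1]:     {0:0.000}", valueType);
        Console.WriteLine("object[1]:  {0:0.000}", referenceType);
        Console.WriteLine("difference: {0:0.000}", referenceType - valueType);

        // Both raw totals are exact multiples of 4096.
        Console.WriteLine(204800 % 4096 == 0 && 237568 % 4096 == 0);
    }
}
```

Both totals divide evenly by 4096, so it seems likely (this is an assumption, not something the question states) that GC.GetTotalMemory(false) reported at a coarse granularity here, and that the fractional parts of 12.478 and 15.755 are measurement noise rather than part of the array layout.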

Anyone know why?

(By the way, if anyone's curious, the overhead for non-array objects, e.g. (object)i in the loop above, is about 8 bytes (8.384). I heard it's 16 bytes in 64-bit processes.)

Recommended answer

Here's a slightly neater (IMO) short but complete program to demonstrate the same thing:

using System;

class Test
{
    const int Size = 100000;

    static void Main()
    {
        object[] array = new object[Size];
        long initialMemory = GC.GetTotalMemory(true);
        for (int i = 0; i < Size; i++)
        {
            array[i] = new string[0];
        }
        long finalMemory = GC.GetTotalMemory(true);
        GC.KeepAlive(array);

        long total = finalMemory - initialMemory;

        Console.WriteLine("Size of each element: {0:0.000} bytes",
                          ((double)total) / Size);
    }
}

But I get the same results - the overhead for any reference type array is 16 bytes, whereas the overhead for any value type array is 12 bytes. I'm still trying to work out why that is, with the help of the CLI spec. Don't forget that reference type arrays are covariant, which may be relevant...
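
To compare several element types in one run, the measurement can be wrapped in a helper. This is a minimal sketch of the same technique as the program above; AverageBytes and the class name are hypothetical, and the figures it prints depend on the runtime version and on whether the process is 32-bit or 64-bit, so they may not match the 12 and 16 bytes quoted here.

```csharp
using System;

static class ArrayOverheadProbe
{
    const int Size = 100000;

    // Allocates Size objects through the factory and returns the average
    // number of bytes each one added to the managed heap.
    public static double AverageBytes(Func<object> factory)
    {
        object[] holder = new object[Size];
        long before = GC.GetTotalMemory(true);
        for (int i = 0; i < Size; i++)
        {
            holder[i] = factory();
        }
        long after = GC.GetTotalMemory(true);
        GC.KeepAlive(holder); // keep every allocation reachable until measured

        return (double)(after - before) / Size;
    }

    static void Main()
    {
        Console.WriteLine("int[0]:    {0:0.000}", AverageBytes(() => new int[0]));
        Console.WriteLine("byte[0]:   {0:0.000}", AverageBytes(() => new byte[0]));
        Console.WriteLine("string[0]: {0:0.000}", AverageBytes(() => new string[0]));
        Console.WriteLine("object[0]: {0:0.000}", AverageBytes(() => new object[0]));
    }
}
```

Zero-length arrays are used so that the whole measured size is overhead, as in the program above.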

Edit: With the help of cordbg, I can confirm Brian's answer - the type pointer of a reference-type array is the same regardless of the actual element type. Presumably there's some funkiness in object.GetType() (which is non-virtual, remember) to account for this.

So, with code of:

object[] x = new object[1];
string[] y = new string[1];
int[] z = new int[1];
z[0] = 0x12345678;
lock(z) {}

We end up with something like the following:

Variables:
x=(0x1f228c8) <System.Object[]>
y=(0x1f228dc) <System.String[]>
z=(0x1f228f0) <System.Int32[]>

Memory:
0x1f228c4: 00000000 003284dc 00000001 00326d54 00000000 // Data for x
0x1f228d8: 00000000 003284dc 00000001 00329134 00000000 // Data for y
0x1f228ec: 00000000 00d443fc 00000001 12345678 // Data for z

Note that I've dumped the memory 1 word before the value of the variable itself.

For x, the values are:

  • The sync block, used for locking and the hash code (or a thin lock - see Brian's comment)
  • Type pointer
  • Size of array
  • Element type pointer
  • Null reference (first element)

For z, the values are:

  • Sync block
  • Type pointer
  • Size of array
  • 0x12345678 (first element)

Different value type arrays (byte[], int[] etc) end up with different type pointers, whereas all reference type arrays use the same type pointer, but have a different element type pointer. The element type pointer is the same value as you'd find as the type pointer for an object of that type. So if we looked at a string object's memory in the above run, it would have a type pointer of 0x00329134.
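
The layout read off the dump can be turned into a small size model. This is only a sketch of the 32-bit layout observed above, not a description of every CLR version; the class and method names are invented for illustration.

```csharp
using System;

static class VectorLayoutModel
{
    const int WordSize = 4; // 32-bit process, as in the dump above

    // Header words seen in the dump: sync block, type pointer and array
    // size, plus an element type pointer for reference-type arrays only.
    public static int HeaderBytes(bool referenceTypeElements)
    {
        int words = 3;
        if (referenceTypeElements)
        {
            words++;
        }
        return words * WordSize;
    }

    // Header plus element storage for a vector of the given length.
    public static int TotalBytes(bool referenceTypeElements, int length, int elementSize)
    {
        return HeaderBytes(referenceTypeElements) + length * elementSize;
    }

    static void Main()
    {
        Console.WriteLine("value-type header:     {0}", HeaderBytes(false));
        Console.WriteLine("reference-type header: {0}", HeaderBytes(true));
        Console.WriteLine("object[1] total:       {0}", TotalBytes(true, 1, 4));
        Console.WriteLine("int[1] total:          {0}", TotalBytes(false, 1, 4));
    }
}
```

The model's 20 bytes for a one-element reference-type array agrees with the spacing between the dumps of x and y (0x1f228d8 - 0x1f228c4 = 0x14), and its 12 and 16 byte headers agree with the measured overheads.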

The word before the type pointer certainly has something to do with either the monitor or the hash code: calling GetHashCode() populates that bit of memory, and I believe the default object.GetHashCode() obtains a sync block to ensure hash code uniqueness for the lifetime of the object. However, just doing lock(x){} didn't do anything, which surprised me...

All of this is only valid for "vector" types, by the way - in the CLR, a "vector" type is a single-dimensional array with a lower-bound of 0. Other arrays will have a different layout - for one thing, they'd need the lower bound stored...
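
For reference, this is roughly how the non-vector cases can be created from C#. It only shows that such arrays exist and report a lower bound or a higher rank; it does not inspect their memory layout. CreateWithLowerBound is a hypothetical helper around Array.CreateInstance.

```csharp
using System;

static class NonVectorArrays
{
    // Creates a one-dimensional int array whose first valid index is lowerBound.
    public static Array CreateWithLowerBound(int length, int lowerBound)
    {
        return Array.CreateInstance(typeof(int), new[] { length }, new[] { lowerBound });
    }

    static void Main()
    {
        int[] vector = new int[3];                  // single dimension, lower bound 0
        Array shifted = CreateWithLowerBound(3, 1); // single dimension, lower bound 1
        int[,] grid = new int[2, 3];                // two dimensions

        Console.WriteLine("{0}: rank {1}, lower bound {2}",
                          vector.GetType(), vector.Rank, vector.GetLowerBound(0));
        Console.WriteLine("{0}: rank {1}, lower bound {2}",
                          shifted.GetType(), shifted.Rank, shifted.GetLowerBound(0));
        Console.WriteLine("{0}: rank {1}, lower bound {2}",
                          grid.GetType(), grid.Rank, grid.GetLowerBound(0));

        // The shifted array is not a vector, so it is not an int[].
        Console.WriteLine(shifted is int[]);
    }
}
```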

So far this has been experimentation, but here's the guesswork - the reason for the system being implemented the way it has. From here on, I really am just guessing.

  • All object[] arrays can share the same JIT code. They're going to behave the same way in terms of memory allocation, array access, Length property and (importantly) the layout of references for the GC. Compare that with value type arrays, where different value types may have different GC "footprints" (e.g. one might have a byte and then a reference, others will have no references at all, etc).
  • Every time you assign a value within an object[] the runtime needs to check that it's valid. It needs to check that the type of the object whose reference you're using for the new element value is compatible with the element type of the array. For instance:

object[] x = new object[1];
object[] y = new string[1];
x[0] = new object(); // Valid
y[0] = new object(); // Invalid - will throw an exception

This is the covariance I mentioned earlier. Now given that this is going to happen for every single assignment, it makes sense to reduce the number of indirections. In particular, I suspect you don't really want to blow the cache by having to go to the type object for each assignment to get the element type. I suspect (and my x86 assembly isn't good enough to verify this) that the test is something like:

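  • Is the value to be copied a null reference? If so, that's fine. (Done.)
  • Fetch the type pointer of the object the reference points at.
  • Is that type pointer the same as the element type pointer (a simple binary equality check)? If so, that's fine. (Done.)
  • Is that type pointer assignment-compatible with the element type pointer? (A much more complicated check, with inheritance and interfaces involved.) If so, that's fine - otherwise, throw an exception.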

If we can terminate the search in the first three steps, there's not a lot of indirection - which is good for something that's going to happen as often as array assignments. None of this needs to happen for value type assignments, because that's statically verifiable.
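
The guessed sequence of checks can be modelled in C# and compared with what the runtime actually does on a store. This is a sketch of the guess only: IsStoreAllowed is a hypothetical stand-in that works on System.Type objects, whereas the real check happens inside the CLR on raw type pointers. The exception the runtime throws for an invalid store is ArrayTypeMismatchException.

```csharp
using System;

static class StoreCheckSketch
{
    // Hypothetical model of the four steps listed above.
    public static bool IsStoreAllowed(Type elementType, object value)
    {
        if (value == null) return true;                 // step 1: null is always fine
        Type valueType = value.GetType();               // step 2: fetch the object's type
        if (valueType == elementType) return true;      // step 3: cheap equality check
        return elementType.IsAssignableFrom(valueType); // step 4: full compatibility check
    }

    // Performs a real store and reports whether the runtime accepted it.
    public static bool RuntimeAccepts(object[] array, object value)
    {
        try
        {
            array[0] = value;
            return true;
        }
        catch (ArrayTypeMismatchException)
        {
            return false;
        }
    }

    static void Main()
    {
        object[] strings = new string[1]; // covariant view of a string[]
        object[] candidates = { null, "text", new object(), 42 };

        foreach (object candidate in candidates)
        {
            bool predicted = IsStoreAllowed(strings.GetType().GetElementType(), candidate);
            bool actual = RuntimeAccepts(strings, candidate);
            Console.WriteLine("{0}: predicted {1}, actual {2}",
                              candidate ?? "null", predicted, actual);
        }
    }
}
```

In this model the null and "text" candidates are accepted by step 1 and step 3 respectively, so only the other two candidates reach the expensive step 4.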

So, that's why I believe reference type arrays are slightly bigger than value type arrays.

Great question - really interesting to delve into it :)
