C# & .NET: stackalloc


Question

I have a few questions about how the stackalloc operator works.

  1. How does it actually allocate? I thought it does something like:

void* stackalloc(int sizeInBytes)
{
    void* p = StackPointer (esp);
    StackPointer += sizeInBytes;
    if(StackPointer exceeds stack size)
        throw new StackOverflowException(...);
    return p;
}

But I have done a few tests, and I'm not sure that's how it works. We can't know exactly what it does and how it does it, but I want to know the basics.

  2. I thought that stack allocation is faster than heap allocation (well, I am actually sure about it). So why does this example:

 using System;
 using System.Diagnostics; // Stopwatch

 class Program
 {
     static void Main(string[] args)
     {
         Stopwatch sw1 = new Stopwatch();
         sw1.Start();
         StackAllocation();
         Console.WriteLine(sw1.ElapsedTicks);

         Stopwatch sw2 = new Stopwatch();
         sw2.Start();
         HeapAllocation();
         Console.WriteLine(sw2.ElapsedTicks);
     }
     static unsafe void StackAllocation()
     {
         for (int i = 0; i < 100; i++)
         {
             int* p = stackalloc int[100];
         }
     }
     static void HeapAllocation()
     {
         for (int i = 0; i < 100; i++)
         {
             int[] a = new int[100];
         }
     }
 }

give an average of about 280 ticks for stack allocation, and usually 0 to 1 ticks for heap allocation? (On my personal computer, an Intel Core i7.)

On the computer I am using now (Intel Core 2 Duo), the results make more sense than the previous ones (probably because "Optimize code" was not checked in VS): about 460 ticks for stack allocation, and about 380 ticks for heap allocation.

But this still doesn't make sense. Why is it so? I guess that the CLR notices that we don't use the array, so maybe it doesn't even allocate it?

Recommended answer

A case where stackalloc is faster:

 private static volatile int _dummy; // just to avoid any optimisations
                                         // that have us measuring the wrong
                                         // thing. Especially since the difference
                                         // is more noticeable in a release build
                                         // (also more noticeable on a multi-core
                                         // machine than single- or dual-core).
 static void Main(string[] args)
 {
     System.Diagnostics.Stopwatch sw1 = new System.Diagnostics.Stopwatch();
     Thread[] threads = new Thread[20];
     sw1.Start();
     for(int t = 0; t != 20; ++t)
     {
        threads[t] = new Thread(DoSA);
        threads[t].Start();
     }
     for(int t = 0; t != 20; ++t)
        threads[t].Join();
     Console.WriteLine(sw1.ElapsedTicks);

     System.Diagnostics.Stopwatch sw2 = new System.Diagnostics.Stopwatch();
     threads = new Thread[20];
     sw2.Start();
     for(int t = 0; t != 20; ++t)
     {
        threads[t] = new Thread(DoHA);
        threads[t].Start();
     }
     for(int t = 0; t != 20; ++t)
        threads[t].Join();
     Console.WriteLine(sw2.ElapsedTicks);
     Console.Read();
 }
 private static void DoSA()
 {
    Random rnd = new Random(1);
    for(int i = 0; i != 100000; ++i)
        StackAllocation(rnd);
 }
 static unsafe void StackAllocation(Random rnd)
 {
    int size = rnd.Next(1024, 131072);
    int* p = stackalloc int[size];
    _dummy = *(p + rnd.Next(0, size));
 }
 private static void DoHA()
 {
    Random rnd = new Random(1);
    for(int i = 0; i != 100000; ++i)
        HeapAllocation(rnd);
 }
 static void HeapAllocation(Random rnd)
 {
    int size = rnd.Next(1024, 131072);
    int[] a = new int[size];
    _dummy = a[rnd.Next(0, size)];
 }

Important differences between this code and that in the question:

  1. We have several threads running. With stack allocation, each thread allocates on its own stack. With heap allocation, they all allocate from a heap shared with the other threads.

  2. Larger sizes are allocated.

  3. Different sizes are allocated each time (though I seeded the random generator to make the tests more deterministic). This makes heap fragmentation more likely, which makes heap allocation less efficient than it is with identical allocations each time.

As well as this, it's worth noting that stackalloc is often used as an alternative to using fixed to pin an array on the heap. Pinning arrays is bad for heap performance (not just for that code, but also for other threads using the same heap), so the performance impact of the heap-based approach would be even greater if the pinned memory stayed in use for any reasonable length of time.
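To make the comparison with fixed concrete, here is a minimal sketch of the same work done both ways. FillNative is a hypothetical stand-in for an unmanaged routine that expects a raw pointer; it is not part of the original answer, and the code assumes the project is compiled with unsafe code enabled.

```csharp
using System;

static class PinningSketch
{
    // Hypothetical stand-in for unmanaged code that writes into a raw buffer.
    static unsafe void FillNative(int* buffer, int length)
    {
        for (int i = 0; i < length; i++)
            buffer[i] = i * i;
    }

    // Heap array pinned with fixed: the GC cannot move the array
    // for as long as the fixed block is executing.
    public static unsafe int SumWithFixed(int length)
    {
        int[] managed = new int[length];
        fixed (int* p = managed)
        {
            FillNative(p, length);
        }
        int sum = 0;
        foreach (int v in managed) sum += v;
        return sum;
    }

    // Stack buffer: it never lives on the managed heap, so there is
    // nothing to pin. It is only valid until this method returns.
    public static unsafe int SumWithStackalloc(int length)
    {
        int* p = stackalloc int[length];
        FillNative(p, length);
        int sum = 0;
        for (int i = 0; i < length; i++) sum += p[i];
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(SumWithFixed(16));      // 1240
        Console.WriteLine(SumWithStackalloc(16)); // 1240
    }
}
```

Both methods hand the same kind of pointer to FillNative; the difference is only where the buffer lives and whether the garbage collector has to work around it.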

While my code demonstrates a case where stackalloc gives a performance benefit, the code in the question is probably closer to most cases where someone might eagerly "optimise" by using it. Hopefully the two pieces of code together show that while stackalloc can give a boost, it can also hurt performance a lot.

Generally, you shouldn't even consider stackalloc unless you are going to need pinned memory for interacting with unmanaged code anyway, and it should be considered an alternative to fixed rather than an alternative to general heap allocation. Use in this case still requires caution, forethought before you start, and profiling after you finish.

Use in other cases could give a benefit, but it should be far down the list of performance improvements you would try.
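If profiling does eventually point to stackalloc for a scratch buffer, one cautious pattern is to cap the stack-allocated size and fall back to the heap above that cap. The sketch below is not from the original answer: it assumes C# 7.2 or later (where stackalloc can be assigned to a Span<T> without unsafe code), and MaxStackBytes and FillAndSum are hypothetical names chosen for illustration.

```csharp
using System;

static class ScratchBufferSketch
{
    // Hypothetical threshold; pick a value that suits the real
    // workload and the thread's stack size.
    const int MaxStackBytes = 256;

    public static int FillAndSum(int count)
    {
        if (count < 0) throw new ArgumentOutOfRangeException(nameof(count));

        // Small buffers go on the stack, larger ones on the heap.
        Span<byte> scratch = count <= MaxStackBytes
            ? stackalloc byte[count]
            : new byte[count];

        for (int i = 0; i < count; i++)
            scratch[i] = (byte)(i % 251);

        int sum = 0;
        foreach (byte b in scratch) sum += b;
        return sum;
    }

    static void Main()
    {
        Console.WriteLine(FillAndSum(10));  // stack path
        Console.WriteLine(FillAndSum(300)); // heap path
    }
}
```

The cap keeps a large or attacker-controlled size from exhausting the stack, while the rest of the method stays identical for both paths.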

To answer part 1 of the question: stackalloc is conceptually much as you describe. It obtains a chunk of the stack memory and returns a pointer to that chunk. It doesn't check that the memory will fit as such; rather, if it attempts to obtain memory past the end of the stack (which is protected by .NET on thread creation), the OS raises an exception to the runtime, which the runtime then turns into a .NET managed exception. Much the same happens if you allocate just a single byte in a method with infinite recursion: unless the call is optimised to avoid that stack allocation (sometimes possible), those single bytes will eventually add up to enough to trigger the stack overflow exception.
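The conceptual model can be sketched as a toy simulation in safe code. This is only an illustration, not how the runtime is implemented: SimulatedStack is a hypothetical class, the explicit bounds check stands in for the protected region at the end of the real stack, and the pointer moves downward because the stack grows toward lower addresses on x86 and x64.

```csharp
using System;

// Toy model only: the real runtime adjusts the CPU stack pointer and
// relies on protected memory at the end of the stack, not on an if-check.
sealed class SimulatedStack
{
    private int _stackPointer; // offset of the current top; moves toward 0

    public SimulatedStack(int sizeInBytes)
    {
        _stackPointer = sizeInBytes;
    }

    public int Remaining => _stackPointer;

    // Reserves a chunk by moving the pointer and returns the chunk's offset.
    public int Allocate(int sizeInBytes)
    {
        if (sizeInBytes < 0)
            throw new ArgumentOutOfRangeException(nameof(sizeInBytes));
        if (sizeInBytes > _stackPointer)
            throw new InvalidOperationException("simulated stack overflow");
        _stackPointer -= sizeInBytes;
        return _stackPointer;
    }
}

static class SimulatedStackDemo
{
    static void Main()
    {
        var stack = new SimulatedStack(1024);
        Console.WriteLine(stack.Allocate(400)); // 624
        Console.WriteLine(stack.Allocate(400)); // 224
        try
        {
            stack.Allocate(400); // does not fit in the remaining 224 bytes
        }
        catch (InvalidOperationException e)
        {
            Console.WriteLine(e.Message); // simulated stack overflow
        }
    }
}
```

Allocation in this model is just a subtraction, which is why stack allocation is cheap per call; the cost the question measured comes from everything around it, not from the pointer adjustment itself.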
