Any faster way of copying arrays in C#?

Question

I have three arrays that need to be combined into one two-dimensional array with three columns. The following code shows up as slow in Performance Explorer. Is there a faster solution?

for (int i = 0; i < sortedIndex.Length; i++) {
    if (i < num_in_left)
    {    
        // add instance to the left child
        leftnode[i, 0] = sortedIndex[i];
        leftnode[i, 1] = sortedInstances[i];
        leftnode[i, 2] = sortedLabels[i];
    }
    else
    { 
        // add instance to the right child
        rightnode[i-num_in_left, 0] = sortedIndex[i];
        rightnode[i-num_in_left, 1] = sortedInstances[i];
        rightnode[i-num_in_left, 2] = sortedLabels[i];
    }                    
}

Update:

I'm actually trying to do the following:

// given three 1D arrays
double[] sortedIndex, sortedInstances, sortedLabels;
// copy them into an N x 3 two-dimensional array (forget about rightnode for now)
double[,] leftnode = new double[sortedIndex.Length, 3];
// some magic happens here so that, conceptually:
// leftnode = {sortedIndex, sortedInstances, sortedLabels};

Answer

Use Buffer.BlockCopy. Its entire purpose is to copy data quickly (see the documentation for Buffer):

This class provides better performance for manipulating primitive types than similar methods in the System.Array class.

Admittedly, I haven't done any benchmarks, but that's what the documentation says. It also works on multidimensional arrays; just make sure that you always specify how many bytes to copy, not how many elements, and that you're working on an array of a primitive type.
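One caveat before reaching for it: C# multidimensional arrays are stored row by row, so in the question's [N, 3] layout the three values for each i are interleaved, and a single block copy per source array cannot fill a column. The sketch below assumes you are free to transpose the layout to [3, N], so that each source array becomes one contiguous row. The class and method names (BlockCopyPacking, PackRange) are made up for illustration, and the code has not been benchmarked.

```csharp
using System;

static class BlockCopyPacking
{
    // Hypothetical helper (not from the original answer): copies `count`
    // elements starting at index `start` from each of three equal-length
    // double[] arrays into a new double[3, count] array, one source per row.
    public static double[,] PackRange(double[] a, double[] b, double[] c,
                                      int start, int count)
    {
        if (a.Length != b.Length || a.Length != c.Length)
            throw new ArgumentException("All source arrays must have the same length.");
        if (start < 0 || count < 0 || start > a.Length - count)
            throw new ArgumentOutOfRangeException(nameof(start));

        // Buffer.BlockCopy takes offsets and counts in BYTES, not elements.
        int srcOffsetBytes = start * sizeof(double);
        int rowBytes = count * sizeof(double);

        var packed = new double[3, count];
        Buffer.BlockCopy(a, srcOffsetBytes, packed, 0 * rowBytes, rowBytes);
        Buffer.BlockCopy(b, srcOffsetBytes, packed, 1 * rowBytes, rowBytes);
        Buffer.BlockCopy(c, srcOffsetBytes, packed, 2 * rowBytes, rowBytes);
        return packed;
    }
}
```

With that layout, the left/right split from the question would be something like PackRange(sortedIndex, sortedInstances, sortedLabels, 0, num_in_left) for the left node and PackRange(sortedIndex, sortedInstances, sortedLabels, num_in_left, sortedIndex.Length - num_in_left) for the right node, and sortedInstances[i] ends up at leftnode[1, i] rather than leftnode[i, 1].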

Also, I have not tested this, but you might be able to squeeze a bit more performance out of the system if you bind a delegate to System.Buffer.memcpyimpl and call that directly. The signature is:

internal static unsafe void memcpyimpl(byte* src, byte* dest, int len)

It does require pointers, but I believe it's optimized for the highest speed possible, and so I don't think there's any way to get faster than that, even if you had assembly at hand.
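Binding to an internal method by name only works on a runtime that actually has a method with that name; if GetMethod returns null, Delegate.CreateDelegate throws. As a hedged alternative, here is a minimal sketch that uses the public Buffer.MemoryCopy method instead. It assumes a runtime that exposes it (.NET Framework 4.6 or later, or .NET Core) and a project compiled with unsafe code enabled. The PointerCopy class and the CopyBytes name are made up for illustration, and this variant is not one of the methods benchmarked in this answer.

```csharp
using System;

static class PointerCopy
{
    // Hypothetical helper: copies every byte of `source` into the start of
    // `destination` through pinned pointers.
    public static unsafe void CopyBytes(byte[] source, byte[] destination)
    {
        if (source == null) throw new ArgumentNullException(nameof(source));
        if (destination == null) throw new ArgumentNullException(nameof(destination));
        if (destination.Length < source.Length)
            throw new ArgumentException("Destination is too small.", nameof(destination));
        if (source.Length == 0)
            return; // nothing to copy

        // Pin both arrays so the GC cannot move them during the copy.
        fixed (byte* pSrc = source)
        fixed (byte* pDest = destination)
        {
            // Arguments: source, destination, destination capacity in bytes,
            // number of bytes to copy.
            Buffer.MemoryCopy(pSrc, pDest, destination.Length, source.Length);
        }
    }
}
```

Unlike the reflection approach, this takes the destination capacity as an argument, so an oversized copy throws instead of overrunning the buffer.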


Update:

Due to requests (and to satisfy my curiosity), I tested this:

using System;
using System.Diagnostics;
using System.Reflection;

unsafe delegate void MemCpyImpl(byte* src, byte* dest, int len);

static class Temp
{
    //There really should be a generic CreateDelegate<T>() method... -___-
    static MemCpyImpl memcpyimpl = (MemCpyImpl)Delegate.CreateDelegate(
        typeof(MemCpyImpl), typeof(Buffer).GetMethod("memcpyimpl",
            BindingFlags.Static | BindingFlags.NonPublic));
    const int COUNT = 32, SIZE = 32 << 20;

    //Use different buffers to help avoid CPU cache effects
    static byte[]
        aSource = new byte[SIZE], aTarget = new byte[SIZE],
        bSource = new byte[SIZE], bTarget = new byte[SIZE],
        cSource = new byte[SIZE], cTarget = new byte[SIZE];


    static unsafe void TestUnsafe()
    {
        Stopwatch sw = Stopwatch.StartNew();
        fixed (byte* pSrc = aSource)
        fixed (byte* pDest = aTarget)
            for (int i = 0; i < COUNT; i++)
                memcpyimpl(pSrc, pDest, SIZE);
        sw.Stop();
        Console.WriteLine("Buffer.memcpyimpl: {0:N0} ticks", sw.ElapsedTicks);
    }

    static void TestBlockCopy()
    {
        Stopwatch sw = Stopwatch.StartNew(); // StartNew already starts the timer
        for (int i = 0; i < COUNT; i++)
            Buffer.BlockCopy(bSource, 0, bTarget, 0, SIZE);
        sw.Stop();
        Console.WriteLine("Buffer.BlockCopy: {0:N0} ticks",
            sw.ElapsedTicks);
    }

    static void TestArrayCopy()
    {
        Stopwatch sw = Stopwatch.StartNew(); // StartNew already starts the timer
        for (int i = 0; i < COUNT; i++)
            Array.Copy(cSource, 0, cTarget, 0, SIZE);
        sw.Stop();
        Console.WriteLine("Array.Copy: {0:N0} ticks", sw.ElapsedTicks);
    }

    static void Main(string[] args)
    {
        for (int i = 0; i < 10; i++)
        {
            TestArrayCopy();
            TestBlockCopy();
            TestUnsafe();
            Console.WriteLine();
        }
    }
}

The results:

Buffer.BlockCopy: 469,151 ticks
Array.Copy: 469,972 ticks
Buffer.memcpyimpl: 496,541 ticks

Buffer.BlockCopy: 421,011 ticks
Array.Copy: 430,694 ticks
Buffer.memcpyimpl: 410,933 ticks

Buffer.BlockCopy: 425,112 ticks
Array.Copy: 420,839 ticks
Buffer.memcpyimpl: 411,520 ticks

Buffer.BlockCopy: 424,329 ticks
Array.Copy: 420,288 ticks
Buffer.memcpyimpl: 405,598 ticks

Buffer.BlockCopy: 422,410 ticks
Array.Copy: 427,826 ticks
Buffer.memcpyimpl: 414,394 ticks

Now change the order:

Array.Copy: 419,750 ticks
Buffer.memcpyimpl: 408,919 ticks
Buffer.BlockCopy: 419,774 ticks

Array.Copy: 430,529 ticks
Buffer.memcpyimpl: 412,148 ticks
Buffer.BlockCopy: 424,900 ticks

Array.Copy: 424,706 ticks
Buffer.memcpyimpl: 427,861 ticks
Buffer.BlockCopy: 421,929 ticks

Array.Copy: 420,556 ticks
Buffer.memcpyimpl: 421,541 ticks
Buffer.BlockCopy: 436,430 ticks

Array.Copy: 435,297 ticks
Buffer.memcpyimpl: 432,505 ticks
Buffer.BlockCopy: 441,493 ticks

Now change the order again:

Buffer.memcpyimpl: 430,874 ticks
Buffer.BlockCopy: 429,730 ticks
Array.Copy: 432,746 ticks

Buffer.memcpyimpl: 415,943 ticks
Buffer.BlockCopy: 423,809 ticks
Array.Copy: 428,703 ticks

Buffer.memcpyimpl: 421,270 ticks
Buffer.BlockCopy: 428,262 ticks
Array.Copy: 434,940 ticks

Buffer.memcpyimpl: 423,506 ticks
Buffer.BlockCopy: 427,220 ticks
Array.Copy: 431,606 ticks

Buffer.memcpyimpl: 422,900 ticks
Buffer.BlockCopy: 439,280 ticks
Array.Copy: 432,649 ticks

In other words, the three are very close. As a general rule memcpyimpl is the fastest, but only by a few percent in these runs, so it's not necessarily worth worrying about.
