为什么 .NET 中的多维数组比普通数组慢? [英] Why are multi-dimensional arrays in .NET slower than normal arrays?

查看:52
本文介绍了为什么 .NET 中的多维数组比普通数组慢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我向大家道歉.当我实际上想说多维数组"时,我使用了术语锯齿状数组"(如下面的示例所示).我为使用不正确的名称道歉.我实际上发现锯齿状数组比多维数组更快!我已经添加了对锯齿状阵列的测量.

我今天尝试使用 jagged 多维数组,但发现它的性能不如我预期的那样.使用一维数组并手动计算索引比使用二维数组快得多(几乎两倍).我使用 1024*1024 数组(初始化为随机值)编写了一个测试,迭代 1000 次,在我的机器上得到以下结果:

sum(double[], int): 2738 ms (100%)sum(double[,]): 5019 毫秒 (183%)总和(双 [][]):2540 毫秒(93%)

这是我的测试代码:

public static double sum(double[] d, int l1) {//假设数组是矩形的双和 = 0;int l2 = d.Length/l1;for (int i = 0; i 

进一步调查表明,第二种方法的 IL 比第一种方法的 IL 大 23%.(代码大小 68 与 52.)这主要是由于调用了 System.Array::GetLength(int).编译器还为 jagged 多维数组发出对 Array::Get 的调用,而它只是为简单数组调用 ldelem.

所以我想知道,为什么通过多维数组访问比普通数组慢?我会假设编译器(或 JIT)会做一些类似于我在我的第一种方法中所做的事情,但事实并非如此.

你能帮我理解为什么会这样吗?

<小时>

更新:根据 Henk Holterman 的建议,这里是 TestTime 的实现:

public static void TestTime(Func action, T obj,int 迭代){秒表 秒表 = Stopwatch.StartNew();for (int i = 0; i <迭代次数; ++i)动作(对象);Console.WriteLine(action.Method.Name + " take " + 秒表.Elapsed);}public static void TestTime(Func action, T1 obj1,T2 obj2,int 迭代){秒表 秒表 = Stopwatch.StartNew();for (int i = 0; i <迭代次数; ++i)动作(对象1,对象2);Console.WriteLine(action.Method.Name + " take " + 秒表.Elapsed);}

解决方案

下界为 0 的单维数组与 IL 中的多维或非 0 下界数组是不同的类型 (vector vs array IIRC).vector 使用起来更简单 - 要获取元素 x,您只需执行 pointer + size * x.对于array,你必须对一维数组做pointer + size * (x-lower bound),并且对你添加的每个维度做更多的算术.>

基本上,CLR 针对更常见的情况进行了优化.

Edit: I apologize everybody. I used the term "jagged array" when I actually meant to say "multi-dimensional array" (as can be seen in my example below). I apologize for using the incorrect name. I actually found jagged arrays to be faster than multi-dimensional ones! I have added my measurements for jagged arrays.

I was trying to use a jagged multi-dimensional array today, when I noticed that it's performance is not as I would have expected. Using a single-dimensional array and manually calculating indices was much faster (almost two times) than using a 2D array. I wrote a test using 1024*1024 arrays (initialized to random values), for 1000 iterations, and I got the following results on my machine:

sum(double[], int): 2738 ms (100%)
sum(double[,]):     5019 ms (183%)
sum(double[][]):    2540 ms ( 93%)

This is my test code:

public static double sum(double[] d, int l1) {
    // assuming the array is rectangular
    double sum = 0;
    int l2 = d.Length / l1;
    for (int i = 0; i < l1; ++i)
        for (int j = 0; j < l2; ++j)
            sum += d[i * l2 + j];
    return sum;
}

public static double sum(double[,] d) {
    double sum = 0;
    int l1 = d.GetLength(0);
    int l2 = d.GetLength(1);
    for (int i = 0; i < l1; ++i)
        for (int j = 0; j < l2; ++j)
            sum += d[i, j];
    return sum;
}

public static double sum(double[][] d) {
    double sum = 0;
    for (int i = 0; i < d.Length; ++i)
        for (int j = 0; j < d[i].Length; ++j)
            sum += d[i][j];
    return sum;
}

public static void Main() {
    Random random = new Random();
    const int l1  = 1024, l2 = 1024;
    double[ ] d1  = new double[l1 * l2];
    double[,] d2  = new double[l1 , l2];
    double[][] d3 = new double[l1][];

    for (int i = 0; i < l1; ++i) {
        d3[i] = new double[l2];
        for (int j = 0; j < l2; ++j)
            d3[i][j] = d2[i, j] = d1[i * l2 + j] = random.NextDouble();
    }
    //
    const int iterations = 1000;
    TestTime(sum, d1, l1, iterations);
    TestTime(sum, d2, iterations);
    TestTime(sum, d3, iterations);
}

Further investigation showed that the IL for the second method is 23% larger than that of the first method. (Code size 68 vs 52.) This is mostly due to calls to System.Array::GetLength(int). The compiler also emits calls to Array::Get for the jagged multi-dimensional array, whereas it simply calls ldelem for the simple array.

So I am wondering, why is access through multi-dimensional arrays slower than normal arrays? I would have assumed the compiler (or JIT) would do something similar to what I did in my first method, but this was not actually the case.

Could you plese help me understand why this is happening the way it is?


Update: Following Henk Holterman's suggestion, here is the implementation of TestTime:

public static void TestTime<T, TR>(Func<T, TR> action, T obj,
                                   int iterations)
{
    Stopwatch stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < iterations; ++i)
        action(obj);
    Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
}

public static void TestTime<T1, T2, TR>(Func<T1, T2, TR> action, T1 obj1,
                                        T2 obj2, int iterations)
{
    Stopwatch stopwatch = Stopwatch.StartNew();
    for (int i = 0; i < iterations; ++i)
        action(obj1, obj2);
    Console.WriteLine(action.Method.Name + " took " + stopwatch.Elapsed);
}

解决方案

Single dimensional arrays with a lower bound of 0 are a different type to either multi-dimensional or non-0 lower bound arrays within IL (vector vs array IIRC). vector is simpler to work with - to get to element x, you just do pointer + size * x. For an array, you have to do pointer + size * (x-lower bound) for a single dimensional array, and yet more arithmetic for each dimension you add.

Basically the CLR is optimised for the vastly more common case.

这篇关于为什么 .NET 中的多维数组比普通数组慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆