Array bounds check efficiency in .NET 4 and above

Question

I'm interested in how efficient low-level algorithms can be in .net. I would like to enable us to choose to write more of our code in C# rather than C++ in the future, but one stumbling block is the bounds checking in .net that occurs with looping and random access to arrays.

A motivating example is a function that calculates the sum of products of corresponding elements in two arrays (this is the dot product of two vectors).

static double SumProduct(double[] X, double[] Y)
{
    double sum = 0;
    int length = X.Length;
    if (length != Y.Length)
        throw new ArgumentException("X and Y must be same size");
    for (int i = 0; i < length; i++) // Check X.Length instead? See below
        sum += X[i] * Y[i];
    return sum; // return the result instead of discarding it
}

From what I can tell (and I don't know enough IL or x86 to check), the compiler won't optimize out the bounds checks on X and Y. Am I wrong, and/or is there a way to write my code so that the compiler can help me out?

Further details

There are many efficiency-arguments for and against using particular languages, not least that it is better to concentrate on "big O" algorithmic cost rather than the constant of proportionality, and higher level languages help you to do this. On the subject of bounds checking in .net, the best article I found is Array Bounds Check Elimination in the CLR on MSDN (also referenced in a stack overflow answer on the importance of enabling optimization).

This dates from 2009, so I wonder whether things have changed significantly since then. Also, the article reveals some real subtleties that would have caught me out so for this reason alone I would welcome some expert advice.

For example, it appears that in my code above I would have been better off writing i < X.Length rather than i < length. I had also naively assumed that, for an algorithm with a single array, writing a foreach loop would better declare my intent to the compiler and give it the best chance of optimizing out the bounds checking.

According to the MSDN article, SumForBAD, below, which I thought was sure to be optimized, would not be. Whereas SumFor would be straightforwardly optimized, and SumForEach would also be optimized, but not trivially (and might not be optimized at all if the array were passed into a function as IEnumerable<int>)?

static double SumForBAD(double[] X)
{
    double sum = 0;
    int length = X.Length; // better to use i < X.Length in loop
    for (int i = 0; i < length; i++)
        sum += X[i];
    return sum;
}

static double SumFor(double[] X)
{
    double sum = 0;
    for (int i = 0; i < X.Length; i++)
        sum += X[i];
    return sum;
}

static double SumForEach(double[] X)
{
    double sum = 0;
    foreach (double element in X) // double, not int, so values are not truncated
        sum += element;
    return sum;
}

I did some investigation based on doug65536's answer. In C++, I compared the times of a SumProduct that does one bounds-check

for(int i=0; i<n; ++i) sum += v1[i]*v2[i];

against another version that does two bounds-checks

for(int i=0; i<n1 && i <n2; ++i) sum += v1[i]*v2[i];

I found that the second version was slower, but only by about 3.5% (Visual Studio 2010, optimized build, default options). However, it occurred to me that in C# there might be three bounds checks: one explicit (i < length in the SumProduct function at the start of this question) and two implicit (X[i] and Y[i]). So I tested a third C++ function, with three bounds checks

for(int i=0; i<n1 && i <n2 && i <n3; ++i) sum += v1[i]*v2[i];

This came in 35% slower than the first, which is worth caring about. I investigated further in another question, "Why does adding extra check in loop make big difference on some machines, and small difference on others?". Interestingly, it seems as though the cost of bounds checking varies significantly between machines.
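The comparison described above might be set up along the following lines. This is only a sketch, not the benchmark actually used for the figures quoted: the function names (sum_product_1, sum_product_2, sum_product_3) and the time_ms helper are hypothetical, std::size_t is used for the index instead of int, and timing uses std::chrono::steady_clock. The bounds are passed as separate parameters so that the optimizer is less likely to prove the extra comparisons redundant, although a given compiler may still do so.

```cpp
#include <chrono>
#include <cstddef>
#include <vector>

// One comparison per iteration: i against a single bound.
double sum_product_1(const std::vector<double>& v1, const std::vector<double>& v2,
                     std::size_t n) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n; ++i) sum += v1[i] * v2[i];
    return sum;
}

// Two comparisons per iteration.
double sum_product_2(const std::vector<double>& v1, const std::vector<double>& v2,
                     std::size_t n1, std::size_t n2) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n1 && i < n2; ++i) sum += v1[i] * v2[i];
    return sum;
}

// Three comparisons per iteration, mimicking one explicit and two implicit checks.
double sum_product_3(const std::vector<double>& v1, const std::vector<double>& v2,
                     std::size_t n1, std::size_t n2, std::size_t n3) {
    double sum = 0.0;
    for (std::size_t i = 0; i < n1 && i < n2 && i < n3; ++i) sum += v1[i] * v2[i];
    return sum;
}

// Calls f() `repeats` times and returns the elapsed wall-clock time in milliseconds.
// The volatile sink keeps the compiler from discarding the computed results.
template <typename F>
double time_ms(F f, int repeats) {
    volatile double sink = 0.0;
    const auto start = std::chrono::steady_clock::now();
    for (int r = 0; r < repeats; ++r) sink = f();
    const auto stop = std::chrono::steady_clock::now();
    (void)sink;
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```

Each variant can then be timed with a call such as time_ms([&] { return sum_product_3(a, b, n, n, n); }, 1000) on large vectors. Since the results reportedly vary from machine to machine, any such numbers are best treated as indicative only.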

Recommended answer

The bounds check won't matter because:

  • The bounds check consists of a cmp/jae instruction pair, which is fused into a single micro-op on modern CPU architectures (the term is "macro-op fusion"). Compare and branch is very highly optimized.

  • The bounds check is a forward branch, which will be statically predicted to be not-taken, also reducing the cost. The branch will never be taken. (If it ever is taken, an exception will be thrown anyway, so the mispredict cost becomes utterly irrelevant.)

  • As soon as there is any memory delay, speculative execution will queue up many iterations of the loop, so the cost of decoding the extra instruction pair almost disappears.

  • Memory access will likely be your bottleneck, so the effect of micro-optimizations like removing bounds checks will disappear.
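To make the shape of such a check concrete, here is a rough C++ analogy. It is not what the CLR JIT actually emits, and the names checked_sum and unchecked_sum are hypothetical. The explicit test in checked_sum stands in for the per-access compare and branch described above: it is never taken for valid indices, and when it is taken an exception is thrown.

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// Sums the first `count` elements, checking every index before use.
// The if statement plays the role of the compare-and-branch pair.
double checked_sum(const std::vector<double>& v, std::size_t count) {
    double sum = 0.0;
    for (std::size_t i = 0; i < count; ++i) {
        if (i >= v.size()) throw std::out_of_range("index out of range");
        sum += v[i];
    }
    return sum;
}

// Sums all elements with no per-access check; the loop bound alone
// guarantees that every index is valid.
double unchecked_sum(const std::vector<double>& v) {
    double sum = 0.0;
    const std::size_t n = v.size();
    for (std::size_t i = 0; i < n; ++i) sum += v[i];
    return sum;
}
```

For in-range input both functions return the same value; they differ only in the extra, normally untaken branch, which is the cost the points above argue is small.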
