Can the C# compiler or JIT optimize away a method call in a lambda expression?


Question

I'm asking this after a discussion that started (in the comments) on another StackOverflow question, and I'm intrigued to know the answer. Consider the following expression:

var objects = RequestObjects.Where(r => r.RequestDate > ListOfDates.Max());



Will there be any (performance) advantage to moving the evaluation of ListOfDates.Max() out of the Where clause in this case, or will (1) the compiler or (2) the JIT optimize this away?

I believe C# only does constant folding at compile time, and it could be argued that ListOfDates.Max() cannot be known at compile time unless ListOfDates itself is somehow constant.

Perhaps there is another compiler (or JIT) optimization that makes sure this is only evaluated once?

Answer

Well, it's a bit of a complex answer.

There are two things involved here: (1) the compiler and (2) the JIT.

The compiler

Simply put, the compiler just translates your C# code to IL code. In most cases it's a pretty trivial translation, and one of the core ideas of .NET is that each function is compiled as an autonomous block of IL code.

So, don't expect too much from the C# -> IL compiler.

The JIT

That's... a bit more complicated.

The JIT compiler basically translates your IL code to assembler. It also contains an SSA-based optimizer. However, there's a time limit, because we don't want to wait too long before our code starts to run. Basically, this means the JIT compiler doesn't do all the super cool stuff that would make your code go extremely fast, simply because that would cost too much time.

But hey, what's the fun in only having theory? We can of course just put it to the test. :) Make sure VS optimizes when you run (Options -> Debugging -> uncheck "Suppress [...]" and "Just My Code"), compile in x64 Release mode, put a breakpoint and see what happens when you switch to the assembler view.

static bool Foo(Func<int, int, int> foo, int a, int b)
{
    return foo(a, b) > 0;  // put breakpoint on this line.
}

public static void Test()
{
    int n = 2;
    int m = 2;
    if (Foo((a, b) => a + b, n, m)) 
    {
        Console.WriteLine("yeah");
    }
}



The first thing you should notice is that the breakpoint is hit. This already tells you that the method isn't inlined; if it were, you wouldn't hit the breakpoint at all.

Next, if you look at the assembler output, you'll notice a 'call' instruction that uses an address. That's your function. On closer inspection, you'll see that it's calling the delegate.

Basically, this means that the call is not inlined, and therefore is not optimized to match the local (method) context. In other words, putting the code directly in your method is probably faster than going through a delegate.

On the other hand, the call is pretty efficient. Basically, the function pointer is simply passed and called. There's no vtable lookup, just a simple call. This means it probably beats calling a member (e.g. IL callvirt). Still, static calls (IL call) should be even faster, since these are predictable at compile time. Again, let's test, shall we?

public static void Test()
{
    ISummer summer = new Summer();
    Stopwatch sw = Stopwatch.StartNew();
    int n = 0;
    for (int i = 0; i < 1000000000; ++i)
    {
        n = summer.Sum(n, i);
    }
    Console.WriteLine("Vtable call took {0} ms, result = {1}", sw.ElapsedMilliseconds, n);

    Summer summer2 = new Summer();
    sw = Stopwatch.StartNew();
    n = 0;
    for (int i = 0; i < 1000000000; ++i)
    {
        n = summer2.Sum(n, i);
    }
    Console.WriteLine("Non-vtable call took {0} ms, result = {1}", sw.ElapsedMilliseconds, n);

    Func<int, int, int> sumdel = (a, b) => a + b;
    sw = Stopwatch.StartNew();
    n = 0;
    for (int i = 0; i < 1000000000; ++i)
    {
        n = sumdel(n, i);
    }
    Console.WriteLine("Delegate call took {0} ms, result = {1}", sw.ElapsedMilliseconds, n);

    sw = Stopwatch.StartNew();
    n = 0;
    for (int i = 0; i < 1000000000; ++i)
    {
        n = Sum(n, i);
    }
    Console.WriteLine("Static call took {0} ms, result = {1}", sw.ElapsedMilliseconds, n);
}
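
The benchmark above uses ISummer, Summer and a static Sum without showing them. A minimal sketch of what they might look like is given here; all three definitions are assumptions (the post never lists them), chosen so that every call style computes the same a + b. The Benchmark class name and the small sanity loop are likewise only illustrative.

```csharp
using System;

// Assumed definition: the post does not show this interface.
interface ISummer
{
    int Sum(int a, int b);
}

// Assumed definition: a plain, non-virtual implementation of ISummer.
class Summer : ISummer
{
    public int Sum(int a, int b) { return a + b; }
}

static class Benchmark
{
    // Assumed body of the static Sum used by the last loop of Test().
    public static int Sum(int a, int b) { return a + b; }

    // Small sanity check: all four call styles must agree before timing them.
    public static void Main()
    {
        ISummer viaInterface = new Summer();
        Summer direct = new Summer();
        Func<int, int, int> viaDelegate = (a, b) => a + b;

        int n1 = 0, n2 = 0, n3 = 0, n4 = 0;
        for (int i = 0; i < 10; ++i)
        {
            n1 = viaInterface.Sum(n1, i); // interface call
            n2 = direct.Sum(n2, i);       // call on the concrete type
            n3 = viaDelegate(n3, i);      // delegate call
            n4 = Sum(n4, i);              // static call
        }
        Console.WriteLine("{0} {1} {2} {3}", n1, n2, n3, n4);
    }
}
```

With these definitions the Test() method would sit inside the same class as the static Sum, so that the unqualified call Sum(n, i) resolves.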



Results:

Vtable call took 2714 ms, result = -1243309312
Non-vtable call took 2558 ms, result = -1243309312
Delegate call took 1904 ms, result = -1243309312
Static call took 324 ms, result = -1243309312

The interesting thing here is actually the last test result. Remember that static calls (IL call) are completely deterministic, which makes them relatively simple for the compiler to optimize. If you inspect the assembler output, you'll find that the call to Sum is actually inlined. This makes sense. In fact, if you test it, putting the code directly in the method is just as fast as the static call.

A small remark about Equals

If you measure the performance of hash tables, something seems fishy with my explanation. It appears as if IEquatable<T> makes things go faster.

Well, that's actually true. :-) Hash containers use IEquatable<T> to call Equals. Now, as we all know, all objects implement Equals(object o). So the containers can call either Equals(object) or Equals(T). The performance of the call itself is the same.

However, if you also implement IEquatable<T>, the Equals(object) implementation usually looks like this:

public override bool Equals(object o)
{
    var obj = o as MyType;
    return obj != null && this.Equals(obj);
}



Furthermore, if MyType is a struct, the runtime also needs to apply boxing and unboxing. If the container just calls IEquatable<T>, none of these steps is necessary. So even though Equals(object) appears slower, that has nothing to do with the call itself.
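
To make the difference concrete, here is a minimal sketch of a struct that implements both overloads. The Point type and the EqualsDemo class are hypothetical and exist only for illustration; the sketch relies on the documented behavior that EqualityComparer<T>.Default (which HashSet<T> uses by default) picks up IEquatable<T> when the type implements it.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical value type used only for illustration.
struct Point : IEquatable<Point>
{
    public readonly int X;
    public readonly int Y;

    public Point(int x, int y) { X = x; Y = y; }

    // Strongly typed comparison: no boxing and no type check.
    public bool Equals(Point other)
    {
        return X == other.X && Y == other.Y;
    }

    // Untyped comparison: the argument arrives boxed and has to be
    // type-checked and unboxed before the typed overload can run.
    public override bool Equals(object obj)
    {
        return obj is Point && Equals((Point)obj);
    }

    public override int GetHashCode()
    {
        return unchecked(X * 397 ^ Y);
    }
}

static class EqualsDemo
{
    public static void Main()
    {
        var set = new HashSet<Point> { new Point(1, 2) };

        // The default comparer goes through IEquatable<Point>, i.e. Equals(Point).
        Console.WriteLine(set.Contains(new Point(1, 2)));

        // Forcing the object overload boxes the argument first.
        Console.WriteLine(new Point(1, 2).Equals((object)new Point(1, 2)));
    }
}
```

Both lines print True; the two paths give the same answer, and only the extra boxing, type check and unboxing on the second one differ.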

Your question

"Will there be any (performance) advantage of moving the evaluation of ListOfDates.Max() out of the Where clause in this case, or will 1. the compiler or 2. JIT optimize this away?"

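Yes, there will be an advantage. Neither the compiler nor the JIT will optimize it away. The lambda is invoked through a delegate for every element, and as the test above shows, that call is not inlined, so ListOfDates.Max() is evaluated again on each invocation.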
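
A minimal sketch of the difference follows. The Request class, the HoistDemo class and the CountedMax wrapper are hypothetical names introduced only to make each evaluation of Max() visible; they are not part of the question's code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the question's request type.
class Request
{
    public DateTime RequestDate { get; set; }
}

static class HoistDemo
{
    // Counts how often the maximum is computed.
    public static int MaxCalls;

    // Wrapper around Max() so that every evaluation is counted.
    public static DateTime CountedMax(List<DateTime> dates)
    {
        MaxCalls++;
        return dates.Max();
    }

    // Max() inside the lambda: evaluated once per element of requests.
    public static List<Request> FilterInline(List<Request> requests, List<DateTime> dates)
    {
        return requests.Where(r => r.RequestDate > CountedMax(dates)).ToList();
    }

    // Max() hoisted into a local: evaluated exactly once.
    public static List<Request> FilterHoisted(List<Request> requests, List<DateTime> dates)
    {
        DateTime max = CountedMax(dates);
        return requests.Where(r => r.RequestDate > max).ToList();
    }

    public static void Main()
    {
        var dates = new List<DateTime> { new DateTime(2020, 1, 1), new DateTime(2021, 1, 1) };
        var requests = Enumerable.Range(2019, 5)
            .Select(y => new Request { RequestDate = new DateTime(y, 6, 1) })
            .ToList();

        MaxCalls = 0;
        int inlineMatches = FilterInline(requests, dates).Count;
        Console.WriteLine("inline:  {0} matches, Max evaluated {1} times", inlineMatches, MaxCalls);

        MaxCalls = 0;
        int hoistedMatches = FilterHoisted(requests, dates).Count;
        Console.WriteLine("hoisted: {0} matches, Max evaluated {1} times", hoistedMatches, MaxCalls);
    }
}
```

With five requests, the inline version evaluates Max five times and the hoisted version once, while both return the same three matches. One thing to keep in mind: because Where is evaluated lazily, hoisting also fixes the maximum at the moment the local is assigned, so later changes to the list are no longer reflected in the query.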

"I believe C# will only do constant folding at compile time, and it could be argued that ListOfDates.Max() can not be known at compile time unless ListOfDates itself is somehow constant."

Actually, if you change the static call to n = 2 + Sum(n, 2), you'll notice that the assembler output contains a 4: once Sum is inlined, 2 + (n + 2) is simplified to n + 4. This proves that the JIT optimizer does do constant folding. (That's quite obvious if you know how SSA optimizers work: constant folding and simplification are run a few times.)

The call through the function pointer itself isn't optimized. It might be in the future, though.

"Perhaps there is another compiler (or JIT) optimization that makes sure that this is only evaluated once?"

As for 'another compiler': if you're willing to add 'another language', you can use C++. In C++ these kinds of calls are sometimes optimized away.

More interestingly, Clang is based on LLVM, and there are a few C# compilers for LLVM as well. I believe Mono has an option to optimize through LLVM, and CoreCLR was working on LLILC. While I haven't tested this, LLVM can definitely do these kinds of optimizations.
