Performance of Func<T> and inheritance

Problem description

I've been having trouble understanding the performance characteristics of using Func<...> throughout my code when using inheritance and generics, which is a combination I find myself using all the time.

Let me start with a minimal test case so we all know what we're talking about, then I'll post the results and then I'm going to explain what I would expect and why...

Minimal test case

using System;

public class GenericsTest2 : GenericsTest<int>
{
    static void Main(string[] args)
    {
        GenericsTest2 at = new GenericsTest2();

        at.test(at.func);
        at.test(at.Check);
        at.test(at.func2);
        at.test(at.Check2);
        at.test((a) => a.Equals(default(int)));
        Console.ReadLine();
    }

    public GenericsTest2()
    {
        func = func2 = (a) => Check(a);
    }

    protected Func<int, bool> func2;

    public bool Check2(int value)
    {
        return value.Equals(default(int));
    }

    public void test(Func<int, bool> func)
    {
        using (Stopwatch sw = new Stopwatch((ts) => { Console.WriteLine("Took {0:0.00}s", ts.TotalSeconds); }))
        {
            for (int i = 0; i < 100000000; ++i)
            {
                func(i);
            }
        }
    }
}

public class GenericsTest<T>
{
    public bool Check(T value)
    {
        return value.Equals(default(T));
    }

    protected Func<T, bool> func;
}

public class Stopwatch : IDisposable
{
    public Stopwatch(Action<TimeSpan> act)
    {
        this.act = act;
        this.start = DateTime.UtcNow;
    }

    private Action<TimeSpan> act;
    private DateTime start;

    public void Dispose()
    {
        act(DateTime.UtcNow.Subtract(start));
    }
}

Results

Took 2.50s  -> at.test(at.func);
Took 1.97s  -> at.test(at.Check);
Took 2.48s  -> at.test(at.func2);
Took 0.72s  -> at.test(at.Check2);
Took 0.81s  -> at.test((a) => a.Equals(default(int)));

My expectations, and why

I would have expected this code to run at exactly the same speed for all 5 methods; to be more precise, even faster than any of these, namely just as fast as:

using (Stopwatch sw = new Stopwatch((ts) => { Console.WriteLine("Took {0:0.00}s", ts.TotalSeconds); }))
{
    for (int i = 0; i < 100000000; ++i)
    {
        bool b = i.Equals(default(int));
    }
}
// this takes 0.32s ?!?

I expected it to take 0.32s because I don't see any reason for the JIT compiler not to inline the code in this particular case.

On closer inspection, I don't understand these performance numbers at all:

  • at.func is passed to the function and cannot be changed during execution. Why isn't this inlined?
  • at.Check is apparently faster than at.Check2, while both cannot be overridden and the IL of at.Check in the case of class GenericsTest2 is as fixed as a rock
  • I see no reason for Func<int, bool> to be slower when passing an inline Func instead of a method that's converted to a Func
  • And why is the difference between test case 2 and 3 a whopping 0.5s while the difference between case 4 and 5 is 0.1s - aren't they supposed to be the same?

Question

I'd really like to understand this... what is going on here that using a generic base class is a whopping 10x slower than inlining the whole lot?

So, basically the question is: why is this happening and how can I fix it?

Update

Based on all the comments so far (thanks!) I did some more digging.

First off, here is a new set of results from repeating the tests with the loop made 5x larger and each test executed 4 times. I've used the Diagnostics Stopwatch and added more tests (with descriptions as well).

(Baseline implementation took 2.61s)

--- Run 0 ---
Took 3.00s for (a) => at.Check2(a)
Took 12.04s for Check3<int>
Took 12.51s for (a) => GenericsTest2.Check(a)
Took 13.74s for at.func
Took 16.07s for GenericsTest2.Check
Took 12.99s for at.func2
Took 1.47s for at.Check2
Took 2.31s for (a) => a.Equals(default(int))
--- Run 1 ---
Took 3.18s for (a) => at.Check2(a)
Took 13.29s for Check3<int>
Took 14.10s for (a) => GenericsTest2.Check(a)
Took 13.54s for at.func
Took 13.48s for GenericsTest2.Check
Took 13.89s for at.func2
Took 1.94s for at.Check2
Took 2.61s for (a) => a.Equals(default(int))
--- Run 2 ---
Took 3.18s for (a) => at.Check2(a)
Took 12.91s for Check3<int>
Took 15.20s for (a) => GenericsTest2.Check(a)
Took 12.90s for at.func
Took 13.79s for GenericsTest2.Check
Took 14.52s for at.func2
Took 2.02s for at.Check2
Took 2.67s for (a) => a.Equals(default(int))
--- Run 3 ---
Took 3.17s for (a) => at.Check2(a)
Took 12.69s for Check3<int>
Took 13.58s for (a) => GenericsTest2.Check(a)
Took 14.27s for at.func
Took 12.82s for GenericsTest2.Check
Took 14.03s for at.func2
Took 1.32s for at.Check2
Took 1.70s for (a) => a.Equals(default(int))

I noticed from these results that the moment you start using generics, it gets much slower. Digging a bit more into the IL, I found this for the non-generic implementation:

L_0000: ldarga.s 'value'
L_0002: ldc.i4.0 
L_0003: call instance bool [mscorlib]System.Int32::Equals(int32)
L_0008: ret 

And for all the generic implementations:

L_0000: ldarga.s 'value'
L_0002: ldloca.s CS$0$0000
L_0004: initobj !T
L_000a: ldloc.0 
L_000b: box !T
L_0010: constrained. !T
L_0016: callvirt instance bool [mscorlib]System.Object::Equals(object)
L_001b: ret 

While most of this can be optimized, I assume the callvirt may be the problem here.

In an attempt to make it faster I added the 'T : IEquatable<T>' constraint to the definition of the method. The result is:

L_0011: callvirt instance bool [mscorlib]System.IEquatable`1<!T>::Equals(!0)

While I understand more about the performance now (it probably cannot inline because it creates a vtable lookup), I'm still confused: Why doesn't it simply call T::Equals? After all, I do specify it will be there...
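
For reference, here is a minimal sketch of the two generic variants discussed above. The names EqualityChecks, IsDefaultBoxed and IsDefaultConstrained are made up for illustration and do not appear in the measured code. As far as I understand it, the C# compiler has to emit one IL body that works for every T, which is why it uses a constrained callvirt rather than a direct call to T::Equals; resolving the actual target for a concrete value type is left to the JIT.

```csharp
using System;

public static class EqualityChecks
{
    // Unconstrained: the call binds to Object.Equals(object),
    // so default(T) has to be boxed for value types.
    public static bool IsDefaultBoxed<T>(T value)
    {
        return value.Equals(default(T));
    }

    // Constrained: the call binds to IEquatable<T>.Equals(T),
    // so the argument is passed without boxing.
    public static bool IsDefaultConstrained<T>(T value) where T : IEquatable<T>
    {
        return value.Equals(default(T));
    }

    public static void Main()
    {
        Console.WriteLine(IsDefaultBoxed(0));
        Console.WriteLine(IsDefaultConstrained(0));
        Console.WriteLine(IsDefaultConstrained(42));
    }
}
```

Both variants return the same answers; the sketch only shows the difference in what the call binds to and makes no claim about timings.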

Recommended answer

Always run micro-benchmarks three times. The first run triggers the JIT, so its cost can be ruled out; then check that the 2nd and 3rd runs agree. This gives:

... run ...
Took 0.79s
Took 0.63s
Took 0.74s
Took 0.24s
Took 0.32s
... run ...
Took 0.73s
Took 0.63s
Took 0.73s
Took 0.24s
Took 0.33s
... run ...
Took 0.74s
Took 0.63s
Took 0.74s
Took 0.25s
Took 0.33s
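
The repeat-three-times procedure described above could be sketched roughly as follows. This is not the harness that produced the numbers in this answer; RepeatedBenchmark, Measure and MeasureRuns are hypothetical names, and it uses System.Diagnostics.Stopwatch rather than the DateTime-based class from the question.

```csharp
using System;
using System.Diagnostics;

public static class RepeatedBenchmark
{
    // Times `iterations` calls of `func` and returns the elapsed time.
    public static TimeSpan Measure(Func<int, bool> func, int iterations)
    {
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < iterations; ++i)
        {
            func(i);
        }
        sw.Stop();
        return sw.Elapsed;
    }

    // Repeats the measurement `runs` times. The first run is expected to
    // include one-time JIT cost, so only the later runs should be compared.
    public static TimeSpan[] MeasureRuns(Func<int, bool> func, int iterations, int runs)
    {
        TimeSpan[] results = new TimeSpan[runs];
        for (int r = 0; r < runs; ++r)
        {
            results[r] = Measure(func, iterations);
        }
        return results;
    }

    public static void Main()
    {
        TimeSpan[] runs = MeasureRuns(a => a.Equals(default(int)), 100000000, 3);
        for (int r = 0; r < runs.Length; ++r)
        {
            Console.WriteLine("Run {0}: took {1:0.00}s", r + 1, runs[r].TotalSeconds);
        }
    }
}
```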

func = func2 = (a) => Check(a);

adds an additional function call. Remove it by

func = func2 = this.Check;

gives:

... 1. run ...
Took 0.64s
Took 0.63s
Took 0.63s
Took 0.24s
Took 0.32s
... 2. run ...
Took 0.63s
Took 0.63s
Took 0.63s
Took 0.24s
Took 0.32s
... 3. run ...
Took 0.63s
Took 0.63s
Took 0.63s
Took 0.24s
Took 0.32s

This shows that the (JIT?) effect between the 1st and 2nd run disappeared once the extra function call was removed. The first 3 tests are now equal.
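
To make the extra call visible, here is a small sketch comparing the two ways of creating the delegate. DelegateTargets, Wrapped and Direct are illustrative names only. It inspects Delegate.Method to show which method each delegate actually points at.

```csharp
using System;

public class DelegateTargets
{
    public bool Check(int value)
    {
        return value.Equals(default(int));
    }

    // Lambda wrapper: the delegate points at a compiler-generated method
    // that in turn calls Check (an extra call unless the JIT inlines it).
    public Func<int, bool> Wrapped()
    {
        return (a) => Check(a);
    }

    // Method group conversion: the delegate points at Check itself.
    public Func<int, bool> Direct()
    {
        return this.Check;
    }

    public static void Main()
    {
        DelegateTargets t = new DelegateTargets();
        // The wrapper's target has a compiler-generated name, not "Check".
        Console.WriteLine(t.Wrapped().Method.Name == "Check");
        Console.WriteLine(t.Direct().Method.Name);
    }
}
```

Both delegates return the same results; only the number of hops between the delegate invocation and Check differs.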

In tests 4 and 5, the compiler can inline the function argument passed to void test(Func<>), while in tests 1 to 3 the compiler would have to go a long way to figure out that the arguments are constant. Sometimes the compiler works under constraints that are not easy to see from a coder's perspective, such as .NET and JIT constraints that come from the dynamic nature of .NET programs compared to a binary built from C++. Either way, it is the inlining of the function argument that makes the difference here.

The difference between 4 and 5? Well, in test 5 it looks like the compiler can very easily inline the function as well. Maybe it builds a context for the closure and resolves it in a slightly more complex way than needed. I did not dig into the MSIL to find out.

The tests above were run with .NET 4.5. Here are the results with 3.5, demonstrating that the compiler got better at inlining:

... 1. run ...
Took 1.06s
Took 1.06s
Took 1.06s
Took 0.24s
Took 0.27s
... 2. run ...
Took 1.06s
Took 1.08s
Took 1.06s
Took 0.25s
Took 0.27s
... 3. run ...
Took 1.05s
Took 1.06s
Took 1.05s
Took 0.24s
Took 0.27s

And with .NET 4:

... 1. run ...
Took 0.97s
Took 0.97s
Took 0.96s
Took 0.22s
Took 0.30s
... 2. run ...
Took 0.96s
Took 0.96s
Took 0.96s
Took 0.22s
Took 0.30s
... 3. run ...
Took 0.97s
Took 0.96s
Took 0.96s
Took 0.22s
Took 0.30s

Now change GenericsTest<T> to a non-generic GenericsTest!!

... 1. run ...
Took 0.28s
Took 0.24s
Took 0.24s
Took 0.24s
Took 0.27s
... 2. run ...
Took 0.24s
Took 0.24s
Took 0.24s
Took 0.24s
Took 0.27s
... 3. run ...
Took 0.25s
Took 0.25s
Took 0.25s
Took 0.24s
Took 0.27s

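Well, this is a surprise from the C# compiler, similar to the one I ran into when sealing classes to avoid virtual function calls. Maybe Eric Lippert has a word on that?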

Replacing the inheritance with aggregation brings the performance back. I have learned to never use inheritance (OK, very, very rarely), and I can highly recommend avoiding it, at least in this case. (This is my pragmatic solution to this question; no flame wars intended.) I use interfaces all the way, though, and they carry no performance penalty.
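
A minimal sketch of what replacing the generic base class with aggregation might look like. IntChecker, AggregationTest and CountDefaults are hypothetical names, not the code that was measured above, and no timings are claimed for this sketch.

```csharp
using System;

// Non-generic helper that takes over the role of the generic base class.
public sealed class IntChecker
{
    public bool Check(int value)
    {
        return value.Equals(default(int));
    }
}

// Holds the checker as a field instead of inheriting from GenericsTest<int>.
public class AggregationTest
{
    private readonly IntChecker checker = new IntChecker();
    private readonly Func<int, bool> func;

    public AggregationTest()
    {
        // Method group conversion, so no extra lambda hop is introduced.
        func = checker.Check;
    }

    // Counts how many values in [0, iterations) equal default(int).
    public int CountDefaults(int iterations)
    {
        int hits = 0;
        for (int i = 0; i < iterations; ++i)
        {
            if (func(i)) ++hits;
        }
        return hits;
    }

    public static void Main()
    {
        Console.WriteLine(new AggregationTest().CountDefaults(1000)); // prints 1
    }
}
```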
