什么更快,在字符串上切换还是在类型上切换? [英] What is quicker, switch on string or elseif on type?

查看:14
本文介绍了什么更快,在字符串上切换还是在类型上切换?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

假设我可以选择根据字符串比较或其他类型确定要采用的代码路径:

哪个更快,为什么?

switch(childNode.Name){案例鲍勃":休息;案例吉尔":休息;案例马尔科":休息;}if(childNode 是 Bob){}elseif(childNode 是 Jill){}else if(childNode 是 Marko){}

更新:我问这个问题的主要原因是因为 switch 语句对于什么算作个案是很清楚的.例如,它不允许您使用变量,只能使用移动到主程序集的常量.我认为它有这个限制是因为它正在做一些时髦的事情.如果它只是转换为 elseifs(正如一位海报评论的那样),那么为什么我们不允许在 case 语句中使用变量?

警告:我正在进行后期优化.此方法在应用的缓慢部分被多次调用.

解决方案

Greg 的分析结果非常适合他所涵盖的确切场景,但有趣的是,当考虑到许多不同的因素(包括被比较的类型数量,以及底层数据中的相对频率和任何模式.

简单的答案是,没有人可以告诉您在您的特定场景中会有什么性能差异,您需要自己在自己的系统中以不同的方式衡量性能以获得准确的答案.

If/Else 链是用于少量类型比较的有效方法,或者如果您可以可靠地预测哪些类型将构成您看到的大部分类型.这种方法的潜在问题是,随着类型数量的增加,必须执行的比较次数也会增加.

如果我执行以下操作:

int value = 25124;如果(值 == 0) ...否则如果(值== 1)...否则如果(值 == 2)......否则如果(值 == 25124)...

在输入正确的块之前,必须评估前面的每个 if 条件.另一方面

switch(value) {案例0:...中断;情况 1:...中断;情况 2:...中断;...案例 25124:...中断;}

将执行一个简单的跳转到正确的代码位.

在您的示例中变得更复杂的地方是您的另一种方法使用字符串上的开关而不是整数,这会变得更复杂一些.在低级别上,字符串不能像整数值那样被打开,所以 C# 编译器会做一些魔法来让你工作.

如果 switch 语句足够小"(编译器会自动执行它认为最好的操作),则切换字符串会生成与 if/else 链相同的代码.

switch(someString) {case "Foo": DoFoo();休息;案例酒吧":DoBar();休息;默认值:DoOther;休息;}

等同于:

if(someString == "Foo") {DoFoo();} else if(someString == "Bar") {DoBar();} 别的 {DoOther();}

一旦字典中的项目列表足够大",编译器将自动创建一个内部字典,该字典从开关中的字符串映射到整数索引,然后基于该索引的开关.

它看起来像这样(想象一下比我要费心输入的条目更多)

静态字段定义在隐藏"位置,该位置与包含 Dictionary<string, int> 类型的 switch 语句的类相关联,并被赋予一个错位名称

//确保字典已加载if(theDictionary == null) {//为了清楚起见简化了,实际实现更复杂//为了保证线程安全theDictionary = new Dictionary();theDictionary["Foo"] = 0;theDictionary["Bar"] = 1;}int switchIndex;if(theDictionary.TryGetValue(someString, out switchIndex)) {开关(开关索引){案例 0: DoFoo();休息;情况一:DoBar();休息;}} 别的 {DoOther();}

在我刚刚运行的一些快速测试中,If/Else 方法大约是 3 种不同类型(其中类型随机分布)的 switch 方法的 3 倍.在 25 种类型时,切换速度略快 (16%),而在 50 种类型时,切换速度快两倍多.

如果您要切换大量类型,我建议使用第 3 种方法:

私有委托 void NodeHandler(ChildNode node);静态字典TypeHandleSwitcher = CreateSwitcher();private static Dictionary创建切换器(){var ret = new Dictionary();ret[typeof(Bob).TypeHandle] = HandleBob;ret[typeof(Jill).TypeHandle] = HandleJill;ret[typeof(Marko).TypeHandle] = HandleMarko;返回 ret;}void HandleChildNode(ChildNode 节点){NodeHandler 处理程序;if (TaskHandleSwitcher.TryGetValue(Type.GetRuntimeType(node), out handler)){处理程序(节点);}别的{//意外的类型...}}

这与 Ted Elliot 建议的类似,但使用运行时类型句柄而不是完整类型对象避免了通过反射加载类型对象的开销.

这是我机器上的一些快速计时:

<前>使用 5,000,000 个数据元素(模式=随机)和 5 种类型测试 3 次迭代最佳方法时间百分比如果/否则 179.67 100.00类型句柄字典 321.33 178.85类型字典 377.67 210.20开关 492.67 274.21使用 5,000,000 个数据元素(模式=随机)和 10 种类型测试 3 次迭代最佳方法时间百分比如果/否则 271.33 100.00类型句柄字典 312.00 114.99类型字典 374.33 137.96开关 490.33 180.71使用 5,000,000 个数据元素(模式=随机)和 15 种类型测试 3 次迭代最佳方法时间百分比类型句柄字典 312.00 100.00如果/否则 369.00 118.27类型字典 371.67 119.12开关 491.67 157.59使用 5,000,000 个数据元素(模式=随机)和 20 种类型测试 3 次迭代最佳方法时间百分比类型句柄字典 335.33 100.00类型字典 373.00 111.23如果/否则 462.67 137.97开关 490.33 146.22使用 5,000,000 个数据元素(模式=随机)和 25 种类型测试 3 次迭代最佳方法时间百分比类型句柄字典 319.33 100.00类型字典 371.00 116.18交换机 483.00 151.25如果/否则 562.00 175.99使用 5,000,000 个数据元素(模式=随机)和 50 种类型测试 3 次迭代最佳方法时间百分比类型句柄字典 319.67 100.00类型字典 376.67 117.83开关 453.33 141.81如果/否则 1,032.67 323.04

至少在我的机器上,当分布超过 15 种不同类型时,类型句柄字典方法击败了所有其他方法用作方法输入的类型是随机的.

另一方面,如果输入完全由 if/else 链中首先检查的类型组成,则该方法要快得多:

<前>使用 5,000,000 个数据元素(模式 = UniformFirst)和 50 种类型测试 3 次迭代最佳方法时间百分比如果/否则 39.00 100.00类型句柄字典 317.33 813.68类型字典 396.00 1,015.38交换机 403.00 1,033.33

相反,如果输入始终是 if/else 链中的最后一件事,则会产生相反的效果:

<前>使用 5,000,000 个数据元素(模式=UniformLast)和 50 种类型测试 3 次迭代最佳方法时间百分比类型句柄字典 317.67 100.00开关 354.33 111.54类型字典 377.67 118.89如果/否则 1,907.67 600.52

如果您可以对您的输入做出一些假设,那么您可能会从混合方法中获得最佳性能,在这种方法中,您对少数最常见的类型执行 if/else 检查,然后在出现以下情况时回退到字典驱动的方法那些失败了.

Lets say I have the option of identifying a code path to take on the basis of a string comparison or else iffing the type:

Which is quicker and why?

switch(childNode.Name)
{
    case "Bob":
      break;
    case "Jill":
      break;
    case "Marko":
      break;
}

if(childNode is Bob)
{
}
elseif(childNode is Jill)
{
}
else if(childNode is Marko)
{
}

Update: The main reason I ask this is because the switch statement is perculiar about what counts as a case. For example it wont allow you to use variables, only constants which get moved to the main assembly. I assumed it had this restriction due to some funky stuff it was doing. If it is only translating to elseifs (as one poster commented) then why are we not allowed variables in case statements?

Caveat: I am post-optimising. This method is called many times in a slow part of the app.

解决方案

Greg's profile results are great for the exact scenario he covered, but interestingly, the relative costs of the different methods change dramatically when considering a number of different factors including the number of types being compared, and the relative frequency and any patterns in the underlying data.

The simple answer is that nobody can tell you what the performance difference is going to be in your specific scenario, you will need to measure the performance in different ways yourself in your own system to get an accurate answer.

The If/Else chain is an effective approach for a small number of type comparisons, or if you can reliably predict which few types are going to make up the majority of the ones that you see. The potential problem with the approach is that as the number of types increases, the number of comparisons that must be executed increases as well.

if I execute the following:

int value = 25124;
if(value == 0) ...
else if (value == 1) ...
else if (value == 2) ...
...
else if (value == 25124) ... 

each of the previous if conditions must be evaluated before the correct block is entered. On the other hand

switch(value) {
 case 0:...break;
 case 1:...break;
 case 2:...break;
 ...
 case 25124:...break;
}

will perform one simple jump to the correct bit of code.

Where it gets more complicated in your example is that your other method uses a switch on strings rather than integers which gets a little more complicated. At a low level, strings can't be switched on in the same way that integer values can so the C# compiler does some magic to make this work for you.

If the switch statement is "small enough" (where the compiler does what it thinks is best automatically) switching on strings generates code that is the same as an if/else chain.

switch(someString) {
    case "Foo": DoFoo(); break;
    case "Bar": DoBar(); break;
    default: DoOther; break;
}

is the same as:

if(someString == "Foo") {
    DoFoo();
} else if(someString == "Bar") {
    DoBar();
} else {
    DoOther();
}

Once the list of items in the dictionary gets "big enough" the compiler will automatically create an internal dictionary that maps from the strings in the switch to an integer index and then a switch based on that index.

It looks something like this (Just imagine more entries than I am going to bother to type)

A static field is defined in a "hidden" location that is associated with the class containing the switch statement of type Dictionary<string, int> and given a mangled name

//Make sure the dictionary is loaded
if(theDictionary == null) { 
    //This is simplified for clarity, the actual implementation is more complex 
    // in order to ensure thread safety
    theDictionary = new Dictionary<string,int>();
    theDictionary["Foo"] = 0;
    theDictionary["Bar"] = 1;
}

int switchIndex;
if(theDictionary.TryGetValue(someString, out switchIndex)) {
    switch(switchIndex) {
    case 0: DoFoo(); break;
    case 1: DoBar(); break;
    }
} else {
    DoOther();
}

In some quick tests that I just ran, the If/Else method is about 3x as fast as the switch for 3 different types (where the types are randomly distributed). At 25 types the switch is faster by a small margin (16%) at 50 types the switch is more than twice as fast.

If you are going to be switching on a large number of types, I would suggest a 3rd method:

private delegate void NodeHandler(ChildNode node);

static Dictionary<RuntimeTypeHandle, NodeHandler> TypeHandleSwitcher = CreateSwitcher();

private static Dictionary<RuntimeTypeHandle, NodeHandler> CreateSwitcher()
{
    var ret = new Dictionary<RuntimeTypeHandle, NodeHandler>();

    ret[typeof(Bob).TypeHandle] = HandleBob;
    ret[typeof(Jill).TypeHandle] = HandleJill;
    ret[typeof(Marko).TypeHandle] = HandleMarko;

    return ret;
}

void HandleChildNode(ChildNode node)
{
    NodeHandler handler;
    if (TaskHandleSwitcher.TryGetValue(Type.GetRuntimeType(node), out handler))
    {
        handler(node);
    }
    else
    {
        //Unexpected type...
    }
}

This is similar to what Ted Elliot suggested, but the usage of runtime type handles instead of full type objects avoids the overhead of loading the type object through reflection.

Here are some quick timings on my machine:

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 5 types
Method                Time    % of optimal
If/Else               179.67  100.00
TypeHandleDictionary  321.33  178.85
TypeDictionary        377.67  210.20
Switch                492.67  274.21

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 10 types
Method                Time    % of optimal
If/Else               271.33  100.00
TypeHandleDictionary  312.00  114.99
TypeDictionary        374.33  137.96
Switch                490.33  180.71

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 15 types
Method                Time    % of optimal
TypeHandleDictionary  312.00  100.00
If/Else               369.00  118.27
TypeDictionary        371.67  119.12
Switch                491.67  157.59

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 20 types
Method                Time    % of optimal
TypeHandleDictionary  335.33  100.00
TypeDictionary        373.00  111.23
If/Else               462.67  137.97
Switch                490.33  146.22

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 25 types
Method                Time    % of optimal
TypeHandleDictionary  319.33  100.00
TypeDictionary        371.00  116.18
Switch                483.00  151.25
If/Else               562.00  175.99

Testing 3 iterations with 5,000,000 data elements (mode=Random) and 50 types
Method                Time      % of optimal
TypeHandleDictionary  319.67    100.00
TypeDictionary        376.67    117.83
Switch                453.33    141.81
If/Else               1,032.67  323.04

On my machine at least, the type handle dictionary approach beats all of the others for anything over 15 different types when the distribution of the types used as input to the method is random.

If on the other hand, the input is composed entirely of the type that is checked first in the if/else chain that method is much faster:

Testing 3 iterations with 5,000,000 data elements (mode=UniformFirst) and 50 types
Method                Time    % of optimal
If/Else               39.00   100.00
TypeHandleDictionary  317.33  813.68
TypeDictionary        396.00  1,015.38
Switch                403.00  1,033.33

Conversely, if the input is always the last thing in the if/else chain, it has the opposite effect:

Testing 3 iterations with 5,000,000 data elements (mode=UniformLast) and 50 types
Method                Time      % of optimal
TypeHandleDictionary  317.67    100.00
Switch                354.33    111.54
TypeDictionary        377.67    118.89
If/Else               1,907.67  600.52

If you can make some assumptions about your input, you might get the best performance from a hybrid approach where you perform if/else checks for the few types that are most common, and then fall back to a dictionary-driven approach if those fail.

这篇关于什么更快,在字符串上切换还是在类型上切换?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆