为什么要将String编译切换为两个开关 [英] Why switch on String compiles into two switches

查看:114
本文介绍了为什么要将String编译切换为两个开关的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在编译交换机时阅读了 JVM规范并对如何编译String上的switch语句感兴趣。这是我检查的测试方法(JDK1.7.0_40):

I read JVM specification on compiling switches and became interested in how switch statement on String is compiled. Here is the test method I examined (JDK1.7.0_40):

static int test(String i) {
    switch (i) {
        case "a":  return  -100;
        case "45b":  return  1;
        case "c":  return  2;
        default: return -1;
    }
}

我希望将此方法编译成简单的lookupswitch on hashCode of string,但突然

I expect this method to be compiled into simple lookupswitch on hashCode of string, but suddenly

static int test(java.lang.String);
Code:
   0: aload_0
   1: astore_1
   2: iconst_m1
   3: istore_2
   4: aload_1
   5: invokevirtual #6         // Method java/lang/String.hashCode:()I
   8: lookupswitch  { // 3
                97: 44
                99: 72
             51713: 58
           default: 83
      }
  44: aload_1
  45: ldc           #7 // String a
  47: invokevirtual #8 // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  50: ifeq          83
  53: iconst_0
  54: istore_2
  55: goto          83
  58: aload_1
  59: ldc           #9  // String 45b
  61: invokevirtual #8  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  64: ifeq          83
  67: iconst_1
  68: istore_2
  69: goto          83
  72: aload_1
  73: ldc           #10 // String c
  75: invokevirtual #8  // Method java/lang/String.equals:(Ljava/lang/Object;)Z
  78: ifeq          83
  81: iconst_2
  82: istore_2
  83: iload_2
  84: tableswitch   { // 0 to 2
                 0: 112
                 1: 115
                 2: 117
           default: 119
      }
 112: bipush        -100
 114: ireturn
 115: iconst_1
 116: ireturn
 117: iconst_2
 118: ireturn
 119: iconst_m1
 120: ireturn

正如你所看到的,在第一个lookupswitch的分支中,JVM并没有为后续的tableswitch(第84行)做真正的工作生成索引。

Tableswitch应该快速工作所以不会带来很多额外的工作。但无论如何,产生额外开关的目的是什么?

更新

我理解hashCode冲突的可能性。我想说的是,编译器可以将所有实际工作从后续的tableswitch移到first,然后使用ifeq跳转到所有switch分支的末尾,而不是后续的tableswitch。 所以我在这里看到一个可能的答案:在第一个开关编译器尝试根据已知的案例数量为ifeq jump预先计算标签,但我不确定这是唯一的原因。


Update2

建议 @ericbn ,我尝试编译

As you could see, in branches of first lookupswitch JVM is not doing the real work generating indices for subsequent tableswitch (line 84) instead.
Tableswitch should work fast so doesn't bring a lot of extra work. But anyway, what is the purpose of generating additional switch?
Update
I understand possibility of hashCode collisions. What I'm trying to say is that instead of subsequent tableswitch, compiler could move all the real work from subsequent tableswitch to first and then use ifeq to jump to the end of all switch branches. So the one possible answer I see here: that in first switch compiler tries to precompute label for ifeq jump based on known number of cases, but I'm not sure that it is the only reason.

Update2
As @ericbn suggested, I tried to compile

switch (i) { 
   case 97: return -100; 
   case 51713: return 1; 
   case 99: return 2; 
   default: return -1;
}

i为int,编译器给了我简单的lookupswitch。

with i as int and compiler gave me plain lookupswitch.

推荐答案

来自javac源代码

         * The general approach used is to translate a single
         * string switch statement into a series of two chained
         * switch statements: the first a synthesized statement
         * switching on the argument string's hash value and
         * computing a string's position in the list of original
         * case labels, if any, followed by a second switch on the
         * computed integer value.  The second switch has the same
         * code structure as the original string switch statement
         * except that the string case labels are replaced with
         * positional integer constants starting at 0.
         *
         * The first switch statement can be thought of as an
         * inlined map from strings to their position in the case
         * label list.  An alternate implementation would use an
         * actual Map for this purpose, as done for enum switches.
         *
         * With some additional effort, it would be possible to
         * use a single switch statement on the hash code of the
         * argument, but care would need to be taken to preserve
         * the proper control flow in the presence of hash
         * collisions and other complications, such as
         * fallthroughs.  Switch statements with one or two
         * alternatives could also be specially translated into
         * if-then statements to omit the computation of the hash
         * code.

这篇关于为什么要将String编译切换为两个开关的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆