为什么导致StackOverflowError的递归方法的调用计数在程序运行之间有所不同? [英] Why does the count of calls of a recursive method causing a StackOverflowError vary between program runs?

查看:108
本文介绍了为什么导致StackOverflowError的递归方法的调用计数在程序运行之间有所不同?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

用于演示目的的简单类:

  public class Main {

private static int counter = 0;

public static void main(String [] args){
try {
f();
} catch(StackOverflowError e){
System.out.println(counter);
}
}

private static void f(){
counter ++;
f();
}
}

我执行了上述程序5次,结果是:

  22025 
22117
15234
21993
21430

为什么每次结果都不同?



我试过设置最大堆栈大小(例如 -Xss256k )。然后结果更加一致,但每次都不一样。



Java版本:

  java版本1.8.0_72
Java(TM)SE运行时环境(版本1.8.0_72-b15)
Java HotSpot(TM)64位服务器VM(版本25.72) -b15,混合模式)

编辑



当JIT被禁用时( -Djava.compiler = NONE )我总是得到相同的数字( 11907 )。



这是有道理的,因为JIT优化可能会影响堆栈帧的大小,而JIT所做的工作肯定会在执行之间发生变化。 / p>

尽管如此,如果通过引用有关该主题的一些文档和/或JIT在这个特定示例中完成的工作的具体示例来确认该理论将是有益的。导致帧大小改变。

解决方案

观察到的方差是由背景JI引起的T编译



这是流程的样子:


  1. 方法 f()在解释器中开始执行。

  2. 在多次调用(大约250次)之后,该方法被安排进行编译。

  3. 编译器线程与应用程序线程并行工作。同时该方法继续在解释器中执行。

  4. 编译器线程完成编译后,方法入口点被替换,因此下一次调用 f()将调用该方法的编译版本。

applcation线程与JIT编译器线程之间基本上存在竞争。解释器可以在方法的编译版本准备好之前执行不同数量的调用。最后是解释和编译帧的混合。



难怪编译的帧布局与解释的帧布局不同。编译帧通常较小;他们不需要将所有执行上下文存储在堆栈上(方法引用,常量池引用,分析器数据,所有参数,表达式变量等)。



此外, 分层编译(自JDK 8起默认)。可以有3种类型的框架组合:翻译,C1和C2(见下文)。






让我们有一些有趣的实验来支持这个理论。


  1. 纯解释模式。没有JIT编译。

    没有比赛=>稳定的结果。

      $ java -Xint Main 
    11895
    11895
    11895


  2. 禁用背景编译。 JIT是ON,但是与应用程序线程同步。

    再没有比赛,但由于编译帧,调用次数现在更高。

      $ java -XX:-BackgroundCompilation Main 
    23462
    23462
    23462


  3. 在执行之前用C1 编译所有内容。与以前的情况不同,堆栈上不会有解释帧,因此数字会更高。

      $ java -Xcomp -XX:TieredStopAtLevel = 1 Main 
    23720
    23720
    23720


  4. 现在在执行之前使用C2 编译所有内容。这将生成具有最小帧的最优化代码。通话次数最多。

      $ java -Xcomp -XX:-TieredCompilation Main 
    59300
    59300
    59300

    由于默认堆栈大小为1M,这应该意味着框架现在只有16个字节长。是吗?

      $ java -Xcomp -XX:-TieredCompilation -XX:CompileCommand = print,Main.f Main 

    0x00000000025ab460:mov%eax,-0x6000(%rsp); StackOverflow检查
    0x00000000025ab467:push%rbp; frame link
    0x00000000025ab468:sub $ 0x10,%rsp
    0x00000000025ab46c:movabs $ 0xd7726ef0,%r10; r10 = Main.class
    0x00000000025ab476:addl $ 0x2,0x68(%r10); Main.counter + = 2
    0x00000000025ab47b:callq 0x00000000023c6620; invokestatic f()
    0x00000000025ab480:add $ 0x10,%rsp
    0x00000000025ab484:pop%rbp; pop框架
    0x00000000025ab485:test%eax,-0x23bb48b(%rip);安全点轮询
    0x00000000025ab48b:retq

    实际上,这里的帧是32字节,但是JIT已经内联了一个级别的递归。


  5. 最后,让我们看一下混合堆栈跟踪。为了得到它,我们将在StackOverflowError上崩溃JVM(调试版本中可用的选项)。

      $ java -XX: AbortVMOnException = java.lang.StackOverflowError Main 

    崩溃转储 hs_err_pid.log 包含详细的堆栈跟踪,我们可以在其中找到底部的解释帧,中间的C1帧和最后的C2帧。

      Java框架:(J =已编译的Java代码,j =已解释,Vv = VM代码)
    J 164 C2 Main.f()V(12字节)@ 0x00007f21251a5958 [0x00007f21251a5900 + 0x0000000000000058 ]
    J 164 C2 Main.f()V(12字节)@ 0x00007f21251a5920 [0x00007f21251a5900 + 0x0000000000000020]
    // ...重复19787次...
    J 164 C2 Main.f ()V(12字节)@ 0x00007f21251a5920 [0x00007f21251a5900 + 0x0000000000000020]
    J 163 C1 Main.f()V(12字节)@ 0x00007f211dca50ec [0x00007f211dca5040 + 0x00000000000000ac]
    J 163 C1 Main.f() V(12字节)@ 0x00007f211dca50ec [0x00007f211dca5040 + 0x00 000000000000ac]
    // ...重复1866次...
    J 163 C1 Main.f()V(12字节)@ 0x00007f211dca50ec [0x00007f211dca5040 + 0x00000000000000ac]
    j Main.f() V + 8
    j Main.f()V + 8
    // ...重复1839次...
    j Main.f()V + 8
    j Main.main( [Ljava / lang / String;)V + 0
    v~StubRoutines :: call_stub



A simple class for demonstration purposes:

public class Main {

    private static int counter = 0;

    public static void main(String[] args) {
        try {
            f();
        } catch (StackOverflowError e) {
            System.out.println(counter);
        }
    }

    private static void f() {
        counter++;
        f();
    }
}

I executed the above program 5 times, the results are:

22025
22117
15234
21993
21430

Why are the results different each time?

I tried setting the max stack size (for example -Xss256k). The results were then a bit more consistent but again not equal each time.

Java version:

java version "1.8.0_72"
Java(TM) SE Runtime Environment (build 1.8.0_72-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.72-b15, mixed mode)

EDIT

When JIT is disabled (-Djava.compiler=NONE) I always get the same number (11907).

This makes sense as JIT optimizations are probably affecting the size of stack frames and the work done by JIT definitely has to vary between the executions.

Nevertheless, I think it would be beneficial if this theory is confirmed with references to some documentation about the topic and/or concrete examples of work done by JIT in this specific example that leads to frame size changes.

解决方案

The observed variance is caused by background JIT compilation.

This is how the process looks like:

  1. Method f() starts execution in interpreter.
  2. After a number of invocations (around 250) the method is scheduled for compilation.
  3. The compiler thread works in parallel to the application thread. Meanwhile the method continues execution in interpreter.
  4. As soon as the compiler thread finishes compilation, the method entry point is replaced, so the next call to f() will invoke the compiled version of the method.

There is basically a race between applcation thread and JIT compiler thread. Interpreter may perform different number of calls before the compiled version of the method is ready. At the end there is a mix of interpreted and compiled frames.

No wonder that compiled frame layout differs from interpreted one. Compiled frames are usually smaller; they don't need to store all the execution context on the stack (method reference, constant pool reference, profiler data, all arguments, expression variables etc.)

Futhermore, there is even more race possibilities with Tiered Compilation (default since JDK 8). There can be a combination of 3 types of frames: interpreter, C1 and C2 (see below).


Let's have some fun experiments to support the theory.

  1. Pure interpreted mode. No JIT compilation.
    No races => stable results.

    $ java -Xint Main
    11895
    11895
    11895
    

  2. Disable background compilation. JIT is ON, but is synchronized with the application thread.
    No races again, but the number of calls is now higher due to compiled frames.

    $ java -XX:-BackgroundCompilation Main
    23462
    23462
    23462
    

  3. Compile everything with C1 before execution. Unlike previous case there will be no interpreted frames on the stack, so the number will be a bit higher.

    $ java -Xcomp -XX:TieredStopAtLevel=1 Main
    23720
    23720
    23720
    

  4. Now compile everything with C2 before execution. This will produce the most optimized code with the smallest frame. The number of calls will be the highest.

    $ java -Xcomp -XX:-TieredCompilation Main
    59300
    59300
    59300
    

    Since the default stack size is 1M, this should mean the frame now is only 16 bytes long. Is it?

    $ java -Xcomp -XX:-TieredCompilation -XX:CompileCommand=print,Main.f Main
    
      0x00000000025ab460: mov    %eax,-0x6000(%rsp)    ; StackOverflow check
      0x00000000025ab467: push   %rbp                  ; frame link
      0x00000000025ab468: sub    $0x10,%rsp            
      0x00000000025ab46c: movabs $0xd7726ef0,%r10      ; r10 = Main.class
      0x00000000025ab476: addl   $0x2,0x68(%r10)       ; Main.counter += 2
      0x00000000025ab47b: callq  0x00000000023c6620    ; invokestatic f()
      0x00000000025ab480: add    $0x10,%rsp
      0x00000000025ab484: pop    %rbp                  ; pop frame
      0x00000000025ab485: test   %eax,-0x23bb48b(%rip) ; safepoint poll
      0x00000000025ab48b: retq
    

    In fact, the frame here is 32 bytes, but JIT has inlined one level of recursion.

  5. Finally, let's look at the mixed stack trace. In order to get it, we'll crash JVM on StackOverflowError (option available in debug builds).

    $ java -XX:AbortVMOnException=java.lang.StackOverflowError Main
    

    The crash dump hs_err_pid.log contains the detailed stack trace where we can find interpreted frames at the bottom, C1 frames in the middle and lastly C2 frames on the top.

    Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
    J 164 C2 Main.f()V (12 bytes) @ 0x00007f21251a5958 [0x00007f21251a5900+0x0000000000000058]
    J 164 C2 Main.f()V (12 bytes) @ 0x00007f21251a5920 [0x00007f21251a5900+0x0000000000000020]
      // ... repeated 19787 times ...
    J 164 C2 Main.f()V (12 bytes) @ 0x00007f21251a5920 [0x00007f21251a5900+0x0000000000000020]
    J 163 C1 Main.f()V (12 bytes) @ 0x00007f211dca50ec [0x00007f211dca5040+0x00000000000000ac]
    J 163 C1 Main.f()V (12 bytes) @ 0x00007f211dca50ec [0x00007f211dca5040+0x00000000000000ac]
      // ... repeated 1866 times ...
    J 163 C1 Main.f()V (12 bytes) @ 0x00007f211dca50ec [0x00007f211dca5040+0x00000000000000ac]
    j  Main.f()V+8
    j  Main.f()V+8
      // ... repeated 1839 times ...
    j  Main.f()V+8
    j  Main.main([Ljava/lang/String;)V+0
    v  ~StubRoutines::call_stub
    

这篇关于为什么导致StackOverflowError的递归方法的调用计数在程序运行之间有所不同?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
相关文章
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆