为什么导致 StackOverflowError 的递归方法的调用次数在程序运行之间会有所不同? [英] Why does the count of calls of a recursive method causing a StackOverflowError vary between program runs?

查看:23
本文介绍了为什么导致 StackOverflowError 的递归方法的调用次数在程序运行之间会有所不同?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

一个用于演示目的的简单类:

A simple class for demonstration purposes:

public class Main {

    private static int counter = 0;

    public static void main(String[] args) {
        try {
            f();
        } catch (StackOverflowError e) {
            System.out.println(counter);
        }
    }

    private static void f() {
        counter++;
        f();
    }
}

上面的程序我执行了5次,结果是:

I executed the above program 5 times, the results are:

22025
22117
15234
21993
21430

为什么每次的结果都不一样?

Why are the results different each time?

我尝试设置最大堆栈大小(例如 -Xss256k).然后结果更加一致,但每次都不相等.

I tried setting the max stack size (for example -Xss256k). The results were then a bit more consistent but again not equal each time.

Java 版本:

java version "1.8.0_72"
Java(TM) SE Runtime Environment (build 1.8.0_72-b15)
Java HotSpot(TM) 64-Bit Server VM (build 25.72-b15, mixed mode)

编辑

当 JIT 被禁用 (-Djava.compiler=NONE) 时,我总是得到相同的数字 (11907).

When JIT is disabled (-Djava.compiler=NONE) I always get the same number (11907).

这是有道理的,因为 JIT 优化可能会影响堆栈帧的大小,而且 JIT 完成的工作肯定会因执行而异.

This makes sense as JIT optimizations are probably affecting the size of stack frames and the work done by JIT definitely has to vary between the executions.

尽管如此,我认为如果通过参考有关该主题的一些文档和/或 JIT 在导致帧大小更改的特定示例中完成的工作的具体示例来证实该理论,那将是有益的.

Nevertheless, I think it would be beneficial if this theory is confirmed with references to some documentation about the topic and/or concrete examples of work done by JIT in this specific example that leads to frame size changes.

推荐答案

观察到的差异是由后台 JIT 编译引起的.

The observed variance is caused by background JIT compilation.

流程如下:

  1. 方法 f() 在解释器中开始执行.
  2. 经过多次调用(大约 250 次)后,该方法被安排进行编译.
  3. 编译器线程与应用程序线程并行工作.同时该方法在解释器中继续执行.
  4. 一旦编译器线程完成编译,方法入口点就会被替换,因此对 f() 的下一次调用将调用该方法的已编译版本.
  1. Method f() starts execution in interpreter.
  2. After a number of invocations (around 250) the method is scheduled for compilation.
  3. The compiler thread works in parallel to the application thread. Meanwhile the method continues execution in interpreter.
  4. As soon as the compiler thread finishes compilation, the method entry point is replaced, so the next call to f() will invoke the compiled version of the method.

应用程序线程和 JIT 编译器线程之间基本上存在竞争.在方法的编译版本准备好之前,解释器可能会执行不同数量的调用.最后是解释和编译帧的混合.

There is basically a race between applcation thread and JIT compiler thread. Interpreter may perform different number of calls before the compiled version of the method is ready. At the end there is a mix of interpreted and compiled frames.

难怪编译的框架布局与解释的框架布局不同.编译后的帧通常较小;他们不需要在堆栈上存储所有执行上下文(方法引用、常量池引用、分析器数据、所有参数、表达式变量等)

No wonder that compiled frame layout differs from interpreted one. Compiled frames are usually smaller; they don't need to store all the execution context on the stack (method reference, constant pool reference, profiler data, all arguments, expression variables etc.)

此外,分层编译(自 JDK 8 起默认).可以有 3 种类型的帧的组合:解释器、C1 和 C2(见下文).

Futhermore, there is even more race possibilities with Tiered Compilation (default since JDK 8). There can be a combination of 3 types of frames: interpreter, C1 and C2 (see below).

让我们做一些有趣的实验来支持这个理论.

  1. 纯解释模式.没有 JIT 编译.
    没有比赛 => 稳定的结果.

  1. Pure interpreted mode. No JIT compilation.
    No races => stable results.

$ java -Xint Main
11895
11895
11895

  • 禁用后台编译.JIT 已开启,但与应用程序线程同步.
    再次没有比赛,但由于已编译的帧,调用次数现在更高.

  • Disable background compilation. JIT is ON, but is synchronized with the application thread.
    No races again, but the number of calls is now higher due to compiled frames.

    $ java -XX:-BackgroundCompilation Main
    23462
    23462
    23462
    

  • 在执行之前用 C1 编译所有内容.与之前的情况不同,堆栈上将没有解释的帧,因此数量会更高一些.

  • Compile everything with C1 before execution. Unlike previous case there will be no interpreted frames on the stack, so the number will be a bit higher.

    $ java -Xcomp -XX:TieredStopAtLevel=1 Main
    23720
    23720
    23720
    

  • 现在用 C2 执行之前编译所有内容.这将生成具有最小帧的最优化代码.调用次数将是最高的.

  • Now compile everything with C2 before execution. This will produce the most optimized code with the smallest frame. The number of calls will be the highest.

    $ java -Xcomp -XX:-TieredCompilation Main
    59300
    59300
    59300
    

    由于默认堆栈大小为 1M,这应该意味着帧现在只有 16 字节长.是吗?

    Since the default stack size is 1M, this should mean the frame now is only 16 bytes long. Is it?

    $ java -Xcomp -XX:-TieredCompilation -XX:CompileCommand=print,Main.f Main
    
      0x00000000025ab460: mov    %eax,-0x6000(%rsp)    ; StackOverflow check
      0x00000000025ab467: push   %rbp                  ; frame link
      0x00000000025ab468: sub    $0x10,%rsp            
      0x00000000025ab46c: movabs $0xd7726ef0,%r10      ; r10 = Main.class
      0x00000000025ab476: addl   $0x2,0x68(%r10)       ; Main.counter += 2
      0x00000000025ab47b: callq  0x00000000023c6620    ; invokestatic f()
      0x00000000025ab480: add    $0x10,%rsp
      0x00000000025ab484: pop    %rbp                  ; pop frame
      0x00000000025ab485: test   %eax,-0x23bb48b(%rip) ; safepoint poll
      0x00000000025ab48b: retq
    

    其实这里的帧是32字节,只是JIT内联了一级递归.

    In fact, the frame here is 32 bytes, but JIT has inlined one level of recursion.

    最后,让我们看看混合堆栈跟踪.为了得到它,我们将在 StackOverflowError 上使 JVM 崩溃(在调试版本中可用的选项).

    Finally, let's look at the mixed stack trace. In order to get it, we'll crash JVM on StackOverflowError (option available in debug builds).

    $ java -XX:AbortVMOnException=java.lang.StackOverflowError Main
    

    故障转储 hs_err_pid.log 包含详细的堆栈跟踪,我们可以在底部找到解释的帧,中间是 C1 帧,最后是顶部的 C2 帧.

    The crash dump hs_err_pid.log contains the detailed stack trace where we can find interpreted frames at the bottom, C1 frames in the middle and lastly C2 frames on the top.

    Java frames: (J=compiled Java code, j=interpreted, Vv=VM code)
    J 164 C2 Main.f()V (12 bytes) @ 0x00007f21251a5958 [0x00007f21251a5900+0x0000000000000058]
    J 164 C2 Main.f()V (12 bytes) @ 0x00007f21251a5920 [0x00007f21251a5900+0x0000000000000020]
      // ... repeated 19787 times ...
    J 164 C2 Main.f()V (12 bytes) @ 0x00007f21251a5920 [0x00007f21251a5900+0x0000000000000020]
    J 163 C1 Main.f()V (12 bytes) @ 0x00007f211dca50ec [0x00007f211dca5040+0x00000000000000ac]
    J 163 C1 Main.f()V (12 bytes) @ 0x00007f211dca50ec [0x00007f211dca5040+0x00000000000000ac]
      // ... repeated 1866 times ...
    J 163 C1 Main.f()V (12 bytes) @ 0x00007f211dca50ec [0x00007f211dca5040+0x00000000000000ac]
    j  Main.f()V+8
    j  Main.f()V+8
      // ... repeated 1839 times ...
    j  Main.f()V+8
    j  Main.main([Ljava/lang/String;)V+0
    v  ~StubRoutines::call_stub
    

  • 这篇关于为什么导致 StackOverflowError 的递归方法的调用次数在程序运行之间会有所不同?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

    查看全文
    相关文章
    登录 关闭
    扫码关注1秒登录
    发送“验证码”获取 | 15天全站免登陆