一些运行/ JIT的故障后Java循环变慢了? [英] Java loop gets slower after some runs / JIT's fault?

查看:146
本文介绍了一些运行/ JIT的故障后Java循环变慢了?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

所以我想对一些基本的java功能进行基准测试,以便为这个问题添加一些信息:将方法声明为静态有什么好处。

So I wanted to benchmark some basic java functionality to add some imformation to this question: What is the gain from declaring a method as static.

我知道编写基准有时并不容易,但这里发生了什么我无法解释。

I know writing benchmarks is sometimes not easy but what happens here I cannot explain.

请注意,我并没有考虑如何解决这个问题,而是为什么会发生这种情况*

测试类:

public class TestPerformanceOfStaticVsDynamicCalls {

    private static final long RUNS = 1_000_000_000L;

    public static void main( String [] args ){

        new TestPerformanceOfStaticVsDynamicCalls().run();
    }

    private void run(){

        long r=0;
        long start, end;

        for( int loop = 0; loop<10; loop++ ){

            // Benchmark

            start = System.currentTimeMillis();
            for( long i = 0; i < RUNS; i++ ) {
                r += addStatic( 1, i );
            }
            end = System.currentTimeMillis();
            System.out.println( "Static: " + ( end - start ) + " ms" );

            start = System.currentTimeMillis();
            for( long i = 0; i < RUNS; i++ ) {
                r += addDynamic( 1, i );
            }
            end = System.currentTimeMillis();
            System.out.println( "Dynamic: " + ( end - start ) + " ms" );

            // Do something with r to keep compiler happy
            System.out.println( r );

        }

    }


    private long addDynamic( long a, long b ){

        return a+b;
    }

    private static long addStatic( long a, long b ){

        return a+b;
    }

}

我期待第一次循环到热身和以下循环更快。

I was expecting the first loop to be warmup and the following loops to be faster.

在Eclipse中运行它会产生奇怪的结果:

Running this in Eclipse gave those strange results:

Static: 621 ms
Dynamic: 631 ms
1000000001000000000
Static: 2257 ms
Dynamic: 2501 ms
2000000002000000000
Static: 2258 ms
Dynamic: 2469 ms
3000000003000000000
Static: 2231 ms
Dynamic: 2464 ms
4000000004000000000

那么wtf?它变慢了。要进行交叉检查,我使用java / c 7运行相同的代码:

So wtf? It got slower. To crosscheck I run the same code with java/c 7:

Static: 620 ms
Dynamic: 627 ms
1000000001000000000
Static: 897 ms
Dynamic: 617 ms
2000000002000000000
Static: 901 ms
Dynamic: 615 ms
3000000003000000000
Static: 888 ms
Dynamic: 616 ms
4000000004000000000

所以这里只有静态调用对于以下循环变慢。更奇怪的是,如果我在最后一个循环后重新排列代码只打印 r 我在Eclipse中得到这个:

So here only the static calls got slower for the following loops. Even more strange if I rearange the code to only print r after the final loop I get this in Eclipse:

Static: 620 ms
Dynamic: 635 ms
Static: 2285 ms
Dynamic: 893 ms
Static: 2258 ms
Dynamic: 900 ms
Static: 2280 ms
Dynamic: 905 ms
4000000004000000000

这与java / c 7:

And this with java/c 7:

Static: 620 ms
Dynamic: 623 ms
Static: 890 ms
Dynamic: 614 ms
Static: 890 ms
Dynamic: 616 ms
Static: 886 ms
Dynamic: 614 ms
4000000004000000000

在更改eclipse中动态/静态基准测试的顺序时:

while at it changing the order of the dynamic/static benchmarks in eclipse:

Dynamic: 618 ms
Static: 626 ms
1000000001000000000
Dynamic: 632 ms
Static: 2524 ms
2000000002000000000
Dynamic: 617 ms
Static: 2528 ms
3000000003000000000
Dynamic: 622 ms
Static: 2506 ms
4000000004000000000

和java / c 7:

and in java/c 7:

Dynamic: 625 ms
Static: 646 ms
1000000001000000000
Dynamic: 2470 ms
Static: 633 ms
2000000002000000000
Dynamic: 2459 ms
Static: 635 ms
3000000003000000000
Dynamic: 2464 ms
Static: 645 ms
4000000004000000000

那么这里发生了什么?

编辑:一些系统信息:

Java version "1.7.0_55"
OpenJDK Runtime Environment (IcedTea 2.4.7) (7u55-2.4.7-1ubuntu1)
OpenJDK 64-Bit Server VM (build 24.51-b03, mixed mode)

Intel(R) Core(TM) i7-2720QM CPU @ 2.20GHz

EDIT2:

使用Java8:

Static: 620 ms
Dynamic: 624 ms
1000000001000000000
Static: 890 ms
Dynamic: 618 ms
2000000002000000000
Static: 891 ms
Dynamic: 616 ms
3000000003000000000
Static: 892 ms
Dynamic: 617 ms
4000000004000000000

其他代码排序在这里产生类似的奇怪(但是更好)的结果。

Other code orderings produce similar strange (but ohter) results here.

推荐答案

序言:手动编写微基准测试几乎总是注定要失败。

那里
框架已经解决了常见的基准测试问题。

Preamble: Writing microbenchmarks manually is almost always doomed to a failure.
There are frameworks that have already solved the common benchmarking problems.


  1. JIT编译单元是一种方法。将几个基准标记合并到一个方法中会导致不可预测的结果。

  1. JIT compilation unit is a method. Incorporating several benchmarks into a single method leads to unpredictable results.

JIT严重依赖于执行配置文件,即运行时统计信息。如果方法长时间运行第一个场景,JIT将优化生成的代码。当方法突然切换到另一种情况时,不要指望它以相同的速度运行。

JIT heavily relies on the execution profile, i.e. the run-time statistics. If a method runs the first scenario for a long time, JIT will optimize the generated code for it. When the method suddenly switches to another scenario, do not expect it to run at the same speed.

JIT可能会跳过优化未执行的代码。它将为此代码留下一个不常见的陷阱。如果陷阱被击中,JVM将取消优化编译的方法,切换到解释器,然后用新知识重新编译代码。例如。当您的方法 run 在第一个热循环中第一次编译时,JIT不知道 System.out.println println ,早期编译的代码很可能会被去优化。

JIT may skip optimizing the code that is not executed. It will leave an uncommon trap for this code. If the trap is ever hit, JVM will deoptimize the compiled method, switch to interpreter and after that recompile the code with the new knowledge. E.g. when your method run is compiled for the first time inside the first hot loop, JIT does not know about System.out.println yet. As soon as the execution reaches println, the earlier compiled code is likely to get deoptimized.

方法越大 - 为JIT编译器优化它就越困难。例如。可能看起来没有足够的备用寄存器来保存所有局部变量。这就是你的情况。

The bigger is method - the harder is to optimize it for JIT compiler. E.g. it may appear that there is not enough spare registers to hold all local variables. That's what happen in your case.

总结一下,你的基准测试似乎通过以下场景:

To sum it up, your benchmark seem to pass through the following scenario:


  1. 第一个热循环( addStatic )触发运行方法。执行配置文件除了 addStatic 方法之外什么都不知道。

  2. System.out.println 触发deoptimization,然后第二个热循环( addDynamic )导致重新编译 run 方法。 / li>
  3. 现在执行配置文件仅包含有关 addDynamic 的信息,因此JIT优化了第二个循环,并且第一个循环似乎有额外的寄存器溢出:

  1. The first hot loop (addStatic) triggers the compilation of run method. The execution profile does not know anything except addStatic method.
  2. System.out.println triggers the deoptimization and after that the second hot loop (addDynamic) leads to recompilation of run method.
  3. Now the execution profile contains information only about addDynamic, so JIT optimizes the second loop, and the first loop appears to have extra register spills:

优化循环:

0x0000000002d01054: add    %rbx,%r14
0x0000000002d01057: add    $0x1,%rbx          ;*ladd
                                              ; - TestPerformanceOfStaticVsDynamicCalls::addDynamic@2
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@105

0x0000000002d0105b: add    $0x1,%r14          ; OopMap{rbp=Oop off=127}
                                              ;*goto
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@116

0x0000000002d0105f: test   %eax,-0x1c91065(%rip)        # 0x0000000001070000
                                              ;*lload
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@92
                                              ;   {poll}
0x0000000002d01065: cmp    $0x3b9aca00,%rbx
0x0000000002d0106c: jl     0x0000000002d01054

带有额外寄存器溢出的循环:

Loop with an extra register spill:

0x0000000002d011d0: mov    0x28(%rsp),%r11  <---- the problem is here
0x0000000002d011d5: add    %r10,%r11
0x0000000002d011d8: add    $0x1,%r10
0x0000000002d011dc: add    $0x1,%r11
0x0000000002d011e0: mov    %r11,0x28(%rsp)    ;*ladd
                                              ; - TestPerformanceOfStaticVsDynamicCalls::addStatic@2
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@33

0x0000000002d011e5: mov    0x28(%rsp),%r11  <---- the problem is here
0x0000000002d011ea: add    $0x1,%r11          ; OopMap{[32]=Oop off=526}
                                              ;*goto
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@44

0x0000000002d011ee: test   %eax,-0x1c911f4(%rip)        # 0x0000000001070000
                                              ;*goto
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@44
                                              ;   {poll}
0x0000000002d011f4: cmp    $0x3b9aca00,%r10
0x0000000002d011fb: jl     0x0000000002d011d0  ;*ifge
                                              ; - TestPerformanceOfStaticVsDynamicCalls::run@25

PS 以下JVM选项很有用分析JIT编译:

P.S. The following JVM options are useful to analyze the JIT compilation:

-XX:+PrintCompilation -XX:+UnlockDiagnosticVMOptions -XX:+PrintInlining -XX:+PrintAssembly -XX:CompileOnly=TestPerformanceOfStaticVsDynamicCalls

这篇关于一些运行/ JIT的故障后Java循环变慢了?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆