First time a Java loop is run SLOW, why? [Sun HotSpot 1.5, sparc]


Problem description


In benchmarking some Java code on a Solaris SPARC box, I noticed that the first time I call the benchmarked function it runs EXTREMELY slowly (10x difference):

  • First | 1 | 25295.979 ms
  • Second | 1 | 2256.990 ms
  • Third | 1 | 2250.575 ms

Why is this? I suspect the JIT compiler, is there any way to verify this?

Edit: In light of some answers I wanted to clarify that this code is the simplest possible test-case I could find exhibiting this behavior. So my goal isn't to get it to run fast, but to understand what's going on so I can avoid it in my real benchmarks.

Solved: Tom Hawtin correctly pointed out that my "SLOW" time was actually reasonable. Following this observation, I attached a debugger to the Java process. During the first run, the inner loop looks like this:

0xf9037218:     cmp      %l0, 100
0xf903721c:     bge,pn   %icc,0xf90371f4        ! 0xf90371f4
0xf9037220:     nop
0xf9037224:     ld       [%l3 + 92], %l2
0xf9037228:     ld       [%l2 + 8], %l6
0xf903722c:     add      %l6, 1, %l5
0xf9037230:     st       %l5, [%l2 + 8]
0xf9037234:     inc      %l0
0xf9037238:     ld       [%l1], %g0
0xf903723c:     ba,pt    %icc,0xf9037218        ! 0xf9037218

On the following runs, the loop looks like this:

0xf90377d4:     sub      %l2, %l0, %l3
0xf90377d8:     add      %l3, %l0, %l2
0xf90377dc:     add      %l2, 1, %l4
0xf90377e0:     inc      %l0
0xf90377e4:     cmp      %l0, 100
0xf90377e8:     bl,pn    %icc,0xf90377d8        ! 0xf90377d8

So HotSpot removed memory accesses from the inner loop, speeding it up by an order of magnitude.
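As a rough source-level analogy for the two listings above (not what HotSpot literally emits), the sketch below contrasts a loop that updates a field on every increment with one that keeps the value in a local and writes it back once. The names `HoistSketch`, `fieldLoop`, and `localLoop` are hypothetical, and the loop bound is shortened for illustration. A compiler is allowed to make this kind of change here because `counter` is not declared `volatile`.

```java
public class HoistSketch {
    // Plain (non-volatile) field, like `counter` in the benchmark.
    private int counter;

    // Shape of the first run: every increment reads and writes the field.
    public int fieldLoop() {
        counter = 0;
        while (counter < 100000) {
            for (int j = 0; j < 100; j++)
                counter++;
            counter -= 99;
        }
        return counter;
    }

    // Hand-written approximation of the later runs: the value lives in a
    // local, which a JIT can keep in a register, and is stored back once.
    public int localLoop() {
        int c = 0;
        while (c < 100000) {
            for (int j = 0; j < 100; j++)
                c++;
            c -= 99;
        }
        counter = c;
        return counter;
    }

    public static void main(String[] args) {
        HoistSketch s = new HoistSketch();
        // Both variants compute the same final value.
        System.out.println(s.fieldLoop() + " " + s.localLoop());
    }
}
```

Both methods end with the same value; only the number of memory accesses in the inner loop differs.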

Lesson: Do the math! I should have done Tom's calculation myself.

Benchmark Java code:

    private int counter;
    private int nThreads;

    private void measure(String tag) throws Exception {
            MyThread threads[] = new MyThread[nThreads];
            int i;

            counter = 0;

            for (i = 0; i < nThreads; i++)
                    threads[i] = new MyThread();

            long start = System.nanoTime();

            for (i = 0; i < nThreads; i++)
                    threads[i].start();

            for (i = 0; i < nThreads; i++)
                    threads[i].join();

            if (tag != null)
                    System.out.format("%-20s | %-2d | %.3f ms \n", tag, nThreads,
                                     new Double((System.nanoTime() - start) / 1000000.0));
    }
    public MyBench() {
            try {
                    this.nThreads = 1;
                    measure("First");
                    measure("Second");
                    measure("Third");
            } catch (Exception e) {
                    System.out.println("Error: " + e);
            }
    }

    private class MyThread extends Thread {
            public void run() {
                    while (counter < 10000000) {
                            // work
                            for (int j = 0; j < 100; j++)
                                    counter++;
                            counter -= 99;
                    }
            }
    }

Solution

Some ugly, unrealistic code (the stuff of microbenchmarks):

                while (counter < 10000000) {
                        // work
                        for (int j = 0; j < 100; j++)
                                counter++;
                        counter -= 99;
                }

So what is this doing, and how fast should it run?

The inner loop increments counter 100 times, then counter is decremented by 99, so each pass of the outer loop is a net increment of 1. Note that counter is a member variable of an outer class, so there is some overhead there. The outer loop therefore runs 10,000,000 times, and the inner loop body runs 1,000,000,000 times.

A loop that goes through accessor methods: call it 25 cycles per iteration. 1,000,000,000 iterations at 1 GHz gives 25 s.
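The estimate above can be written out as a small calculation. This is only a sketch: `CycleEstimate` and its method names are hypothetical, and both the 25 cycles per iteration and the 1 GHz clock are the rough assumptions from the text, not measured values. The last helper turns the question's measured times back into cycles per iteration under the same clock assumption.

```java
public class CycleEstimate {
    // Inner-loop iterations: outer passes times inner passes per outer pass.
    public static long totalIterations(long outerPasses, int innerPasses) {
        return outerPasses * innerPasses;
    }

    // Predicted wall time in seconds for a given cost per iteration.
    public static double predictedSeconds(long iterations, double cyclesPerIteration, double clockHz) {
        return iterations * cyclesPerIteration / clockHz;
    }

    // Cycles per iteration implied by a measured wall time in milliseconds.
    public static double impliedCycles(double measuredMs, long iterations, double clockHz) {
        return measuredMs / 1000.0 * clockHz / iterations;
    }

    public static void main(String[] args) {
        long iterations = totalIterations(10000000L, 100);
        double clockHz = 1e9; // assumed 1 GHz, as in the estimate above

        System.out.println("iterations: " + iterations);
        System.out.println("predicted seconds: " + predictedSeconds(iterations, 25.0, clockHz));
        // Measured times taken from the question.
        System.out.println("first run, cycles/iteration: " + impliedCycles(25295.979, iterations, clockHz));
        System.out.println("second run, cycles/iteration: " + impliedCycles(2256.990, iterations, clockHz));
    }
}
```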

Hey, we predicted the SLOW time. The slow time is fast. The fast times are after the benchmark has been broken in some way - 2.5 cycles an iteration? Use -server and you might find it gets even more silly.
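On the question of how to verify whether the JIT compiler is involved, one option is the `java.lang.management` API (available since Java 5), which reports accumulated compilation time when the VM supports it. The sketch below is illustrative only: `JitTimeProbe`, `jitMillis`, and `work` are hypothetical names, and `work` is a shortened stand-in for the benchmark loop. On HotSpot, the `-XX:+PrintCompilation` flag is another way to see when methods get compiled.

```java
import java.lang.management.CompilationMXBean;
import java.lang.management.ManagementFactory;

public class JitTimeProbe {
    private static int counter;

    // Accumulated JIT compilation time in ms, or -1 if the VM has no JIT
    // or does not support compilation time monitoring.
    public static long jitMillis() {
        CompilationMXBean jit = ManagementFactory.getCompilationMXBean();
        if (jit == null || !jit.isCompilationTimeMonitoringSupported())
            return -1;
        return jit.getTotalCompilationTime();
    }

    // Hypothetical stand-in workload with the same shape as the benchmark.
    public static int work() {
        counter = 0;
        while (counter < 1000000) {
            for (int j = 0; j < 100; j++)
                counter++;
            counter -= 99;
        }
        return counter;
    }

    public static void main(String[] args) {
        for (int run = 1; run <= 3; run++) {
            long jitBefore = jitMillis();
            long start = System.nanoTime();
            int result = work();
            double ms = (System.nanoTime() - start) / 1000000.0;
            long jitAfter = jitMillis();
            String jitDelta = jitBefore < 0 ? "n/a" : (jitAfter - jitBefore) + " ms";
            System.out.println("run " + run + ": " + ms + " ms, JIT time delta "
                    + jitDelta + ", result " + result);
        }
    }
}
```

If the compilation time grows during an early run and stays flat afterwards, that suggests the code was compiled during that run; running the workload a few times untimed before measuring is a common way to keep that cost out of the reported numbers.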
