为什么在 x64 Java 中 long 比 int 慢? [英] Why is long slower than int in x64 Java?

查看:44
本文介绍了为什么在 x64 Java 中 long 比 int 慢?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我在 Surface Pro 2 平板电脑上运行带有 Java 7 更新 45 x64(未安装 32 位 Java)的 Windows 8.1 x64.

I'm running Windows 8.1 x64 with Java 7 update 45 x64 (no 32 bit Java installed) on a Surface Pro 2 tablet.

下面的代码在 i 的类型为 long 时需要 1688 毫秒,当 i 为 int 时需要 109 毫秒.为什么在具有 64 位 JVM 的 64 位平台上,long(64 位类型)比 int 慢一个数量级?

The code below takes 1688ms when the type of i is a long and 109ms when i is an int. Why is long (a 64 bit type) an order of magnitude slower than int on a 64 bit platform with a 64 bit JVM?

我唯一的猜测是 CPU 添加 64 位整数所需的时间比添加 32 位整数的时间长,但这似乎不太可能.我怀疑 Haswell 没有使用波纹进位加法器.

My only speculation is that the CPU takes longer to add a 64 bit integer than a 32 bit one, but that seems unlikely. I suspect Haswell doesn't use ripple-carry adders.

顺便说一下,我正在 Eclipse Kepler SR1 中运行它.

I'm running this in Eclipse Kepler SR1, btw.

public class Main {

    private static long i = Integer.MAX_VALUE;

    public static void main(String[] args) {    
        System.out.println("Starting the loop");
        long startTime = System.currentTimeMillis();
        while(!decrementAndCheck()){
        }
        long endTime = System.currentTimeMillis();
        System.out.println("Finished the loop in " + (endTime - startTime) + "ms");
    }

    private static boolean decrementAndCheck() {
        return --i < 0;
    }

}

以下是相同系统的 VS 2013(如下)编译的等效 C++ 代码的结果.long: 72265ms int: 74656ms 这些结果是在调试 32 位模式下.

Here are the results from equivalent C++ code compiled by VS 2013 (below), same system. long: 72265ms int: 74656ms Those results were in debug 32 bit mode.

在 64 位发布模式下:long: 875ms long long: 906ms int: 1047ms

In 64 bit release mode: long: 875ms long long: 906ms int: 1047ms

这表明我观察到的结果是 JVM 优化的怪异,而不是 CPU 限制.

This suggests that the result I observed is JVM optimization weirdness rather than CPU limitations.

#include "stdafx.h"
#include "iostream"
#include "windows.h"
#include "limits.h"

long long i = INT_MAX;

using namespace std;


boolean decrementAndCheck() {
return --i < 0;
}


int _tmain(int argc, _TCHAR* argv[])
{


cout << "Starting the loop" << endl;

unsigned long startTime = GetTickCount64();
while (!decrementAndCheck()){
}
unsigned long endTime = GetTickCount64();

cout << "Finished the loop in " << (endTime - startTime) << "ms" << endl;



}

刚刚在 Java 8 RTM 中再次尝试了这个,没有显着变化.

Just tried this again in Java 8 RTM, no significant change.

推荐答案

当你使用 longs 时,我的 JVM 对内循环做了这个非常简单的事情:

My JVM does this pretty straightforward thing to the inner loop when you use longs:

0x00007fdd859dbb80: test   %eax,0x5f7847a(%rip)  /* fun JVM hack */
0x00007fdd859dbb86: dec    %r11                  /* i-- */
0x00007fdd859dbb89: mov    %r11,0x258(%r10)      /* store i to memory */
0x00007fdd859dbb90: test   %r11,%r11             /* unnecessary test */
0x00007fdd859dbb93: jge    0x00007fdd859dbb80    /* go back to the loop top */

当你使用 ints 时,它会作弊,很难.首先有一些我没有声称理解但看起来像展开循环的设置:

It cheats, hard, when you use ints; first there's some screwiness that I don't claim to understand but looks like setup for an unrolled loop:

0x00007f3dc290b5a1: mov    %r11d,%r9d
0x00007f3dc290b5a4: dec    %r9d
0x00007f3dc290b5a7: mov    %r9d,0x258(%r10)
0x00007f3dc290b5ae: test   %r9d,%r9d
0x00007f3dc290b5b1: jl     0x00007f3dc290b662
0x00007f3dc290b5b7: add    $0xfffffffffffffffe,%r11d
0x00007f3dc290b5bb: mov    %r9d,%ecx
0x00007f3dc290b5be: dec    %ecx              
0x00007f3dc290b5c0: mov    %ecx,0x258(%r10)   
0x00007f3dc290b5c7: cmp    %r11d,%ecx
0x00007f3dc290b5ca: jle    0x00007f3dc290b5d1
0x00007f3dc290b5cc: mov    %ecx,%r9d
0x00007f3dc290b5cf: jmp    0x00007f3dc290b5bb
0x00007f3dc290b5d1: and    $0xfffffffffffffffe,%r9d
0x00007f3dc290b5d5: mov    %r9d,%r8d
0x00007f3dc290b5d8: neg    %r8d
0x00007f3dc290b5db: sar    $0x1f,%r8d
0x00007f3dc290b5df: shr    $0x1f,%r8d
0x00007f3dc290b5e3: sub    %r9d,%r8d
0x00007f3dc290b5e6: sar    %r8d
0x00007f3dc290b5e9: neg    %r8d
0x00007f3dc290b5ec: and    $0xfffffffffffffffe,%r8d
0x00007f3dc290b5f0: shl    %r8d
0x00007f3dc290b5f3: mov    %r8d,%r11d
0x00007f3dc290b5f6: neg    %r11d
0x00007f3dc290b5f9: sar    $0x1f,%r11d
0x00007f3dc290b5fd: shr    $0x1e,%r11d
0x00007f3dc290b601: sub    %r8d,%r11d
0x00007f3dc290b604: sar    $0x2,%r11d
0x00007f3dc290b608: neg    %r11d
0x00007f3dc290b60b: and    $0xfffffffffffffffe,%r11d
0x00007f3dc290b60f: shl    $0x2,%r11d
0x00007f3dc290b613: mov    %r11d,%r9d
0x00007f3dc290b616: neg    %r9d
0x00007f3dc290b619: sar    $0x1f,%r9d
0x00007f3dc290b61d: shr    $0x1d,%r9d
0x00007f3dc290b621: sub    %r11d,%r9d
0x00007f3dc290b624: sar    $0x3,%r9d
0x00007f3dc290b628: neg    %r9d
0x00007f3dc290b62b: and    $0xfffffffffffffffe,%r9d
0x00007f3dc290b62f: shl    $0x3,%r9d
0x00007f3dc290b633: mov    %ecx,%r11d
0x00007f3dc290b636: sub    %r9d,%r11d
0x00007f3dc290b639: cmp    %r11d,%ecx
0x00007f3dc290b63c: jle    0x00007f3dc290b64f
0x00007f3dc290b63e: xchg   %ax,%ax /* OK, fine; I know what a nop looks like */

然后是展开的循环本身:

then the unrolled loop itself:

0x00007f3dc290b640: add    $0xfffffffffffffff0,%ecx
0x00007f3dc290b643: mov    %ecx,0x258(%r10)
0x00007f3dc290b64a: cmp    %r11d,%ecx
0x00007f3dc290b64d: jg     0x00007f3dc290b640

然后是展开循环的拆卸代码,它本身就是一个测试和一个直循环:

then the teardown code for the unrolled loop, itself a test and a straight loop:

0x00007f3dc290b64f: cmp    $0xffffffffffffffff,%ecx
0x00007f3dc290b652: jle    0x00007f3dc290b662
0x00007f3dc290b654: dec    %ecx
0x00007f3dc290b656: mov    %ecx,0x258(%r10)
0x00007f3dc290b65d: cmp    $0xffffffffffffffff,%ecx
0x00007f3dc290b660: jg     0x00007f3dc290b654

因此对于整数,它的速度提高了 16 倍,因为 JIT 将 int 循环展开了 16 次,但根本没有展开 long 循环.

So it goes 16 times faster for ints because the JIT unrolled the int loop 16 times, but didn't unroll the long loop at all.

为了完整起见,这是我实际尝试过的代码:

For completeness, here is the code I actually tried:

public class foo136 {
  private static int i = Integer.MAX_VALUE;
  public static void main(String[] args) {
    System.out.println("Starting the loop");
    for (int foo = 0; foo < 100; foo++)
      doit();
  }

  static void doit() {
    i = Integer.MAX_VALUE;
    long startTime = System.currentTimeMillis();
    while(!decrementAndCheck()){
    }
    long endTime = System.currentTimeMillis();
    System.out.println("Finished the loop in " + (endTime - startTime) + "ms");
  }

  private static boolean decrementAndCheck() {
    return --i < 0;
  }
}

程序集转储是使用选项 -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly 生成的.请注意,您需要弄乱 JVM 安装才能使这项工作也适合您;您需要将一些随机共享库放在正确的位置,否则它会失败.

The assembly dumps were generated using the options -XX:+UnlockDiagnosticVMOptions -XX:+PrintAssembly. Note that you need to mess around with your JVM installation to have this work for you as well; you need to put some random shared library in exactly the right place or it will fail.

这篇关于为什么在 x64 Java 中 long 比 int 慢?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆