Why is Arrays.copyOf 2 times faster than System.arraycopy for small arrays?


Question

I was recently playing with some benchmarks and found very interesting results that I can't explain right now. Here is the benchmark:

import java.util.Arrays;

import org.openjdk.jmh.annotations.*;

@BenchmarkMode(Mode.Throughput)
@Fork(1)
@State(Scope.Thread)
@Warmup(iterations = 10, time = 1, batchSize = 1000)
@Measurement(iterations = 10, time = 1, batchSize = 1000)
public class ArrayCopy {

    @Param({"1","5","10","100", "1000"})
    private int size;
    private int[] ar;

    @Setup
    public void setup() {
        ar = new int[size];
        for (int i = 0; i < size; i++) {
            ar[i] = i;
        }
    }

    @Benchmark
    public int[] SystemArrayCopy() {
        final int length = size;
        int[] result = new int[length];
        System.arraycopy(ar, 0, result, 0, length);
        return result;
    }

    @Benchmark
    public int[] javaArrayCopy() {
        final int length = size;
        int[] result = new int[length];
        for (int i = 0; i < length; i++) {
            result[i] = ar[i];
        }
        return result;
    }

    @Benchmark
    public int[] arraysCopyOf() {
        final int length = size;
        return Arrays.copyOf(ar, length);
    }

}

Results:

Benchmark                  (size)   Mode  Cnt       Score      Error  Units
ArrayCopy.SystemArrayCopy       1  thrpt   10   52533.503 ± 2938.553  ops/s
ArrayCopy.SystemArrayCopy       5  thrpt   10   52518.875 ± 4973.229  ops/s
ArrayCopy.SystemArrayCopy      10  thrpt   10   53527.400 ± 4291.669  ops/s
ArrayCopy.SystemArrayCopy     100  thrpt   10   18948.334 ±  929.156  ops/s
ArrayCopy.SystemArrayCopy    1000  thrpt   10    2782.739 ±  184.484  ops/s
ArrayCopy.arraysCopyOf          1  thrpt   10  111665.763 ± 8928.007  ops/s
ArrayCopy.arraysCopyOf          5  thrpt   10   97358.978 ± 5457.597  ops/s
ArrayCopy.arraysCopyOf         10  thrpt   10   93523.975 ± 9282.989  ops/s
ArrayCopy.arraysCopyOf        100  thrpt   10   19716.960 ±  728.051  ops/s
ArrayCopy.arraysCopyOf       1000  thrpt   10    1897.061 ±  242.788  ops/s
ArrayCopy.javaArrayCopy         1  thrpt   10   58053.872 ± 4955.749  ops/s
ArrayCopy.javaArrayCopy         5  thrpt   10   49708.647 ± 3579.826  ops/s
ArrayCopy.javaArrayCopy        10  thrpt   10   48111.857 ± 4603.024  ops/s
ArrayCopy.javaArrayCopy       100  thrpt   10   18768.866 ±  445.238  ops/s
ArrayCopy.javaArrayCopy      1000  thrpt   10    2462.207 ±  126.549  ops/s

So there are two strange things here:


  • Arrays.copyOf is 2 times faster than System.arraycopy for small arrays (1,5,10 size). However, on a large array of size 1000 Arrays.copyOf becomes almost 2 times slower. I know that both methods are intrinsics, so I would expect the same performance. Where does this difference come from?
  • Manual copy for a 1-element array is faster than System.arraycopy. It's not clear to me why. Does anybody know?

VM version: JDK 1.8.0_131, VM 25.131-b11

Answer

Your SystemArrayCopy benchmark is not semantically equivalent to arraysCopyOf.

It becomes equivalent if you replace

    System.arraycopy(ar, 0, result, 0, length);

with

    System.arraycopy(ar, 0, result, 0, Math.min(ar.length, length));

With this change the performance of both benchmarks will also become similar.
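
To illustrate why the Math.min form is the equivalent one, here is a minimal sketch. The class name CopyEquivalence and the helper copyWithMin are hypothetical and not part of the original benchmark. The helper allocates the result and copies at most src.length elements, which, as far as I can tell, is what the JDK 8 source of Arrays.copyOf(int[], int) does as well.

```java
import java.util.Arrays;

public class CopyEquivalence {

    // Hypothetical helper: allocate newLength elements, then copy
    // only as many as the source actually has.
    public static int[] copyWithMin(int[] src, int newLength) {
        int[] result = new int[newLength];
        System.arraycopy(src, 0, result, 0, Math.min(src.length, newLength));
        return result;
    }

    public static void main(String[] args) {
        int[] ar = {0, 1, 2, 3, 4};
        // Compare against Arrays.copyOf for shorter, equal and longer lengths.
        for (int newLength : new int[] {0, 3, 5, 8}) {
            int[] expected = Arrays.copyOf(ar, newLength);
            int[] actual = copyWithMin(ar, newLength);
            System.out.println(newLength + ": " + Arrays.equals(expected, actual));
        }
    }
}
```

With the Math.min bound the copy length can never exceed the source length, so the two variants agree even when the requested length is larger than the source array.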

Why is the first variant slower then?


  1. Without knowing how length relates to ar.length, the JVM needs to perform an additional bounds check and be prepared to throw IndexOutOfBoundsException when length > ar.length.
  2. This also breaks the optimization that eliminates redundant zeroing. Every allocated array must be initialized with zeros. However, the JIT can skip the zeroing if it sees that the array is filled right after creation. -prof perfasm clearly shows that the original SystemArrayCopy benchmark spends a significant amount of time clearing the allocated array:

 0,84%    0x000000000365d35f: shr    $0x3,%rcx
 0,06%    0x000000000365d363: add    $0xfffffffffffffffe,%rcx
 0,69%    0x000000000365d367: xor    %rax,%rax
          0x000000000365d36a: shl    $0x3,%rcx
21,02%    0x000000000365d36e: rep rex.W stos %al,%es:(%rdi)  ;*newarray
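
The first point can be observed from plain Java: the unguarded call has to be able to fail when the requested length exceeds the source length. A minimal sketch follows; the class name BoundsCheckDemo and the helper copyThrows are made up for illustration and do not appear in the original benchmark.

```java
public class BoundsCheckDemo {

    // Hypothetical helper: returns true if System.arraycopy rejects
    // the requested length for the given source array.
    public static boolean copyThrows(int[] src, int length) {
        int[] result = new int[length];
        try {
            System.arraycopy(src, 0, result, 0, length);
            return false;
        } catch (IndexOutOfBoundsException e) {
            // Thrown when srcPos + length is greater than src.length.
            return true;
        }
    }

    public static void main(String[] args) {
        int[] ar = {0, 1, 2};
        System.out.println("length 3 throws: " + copyThrows(ar, 3));
        System.out.println("length 4 throws: " + copyThrows(ar, 4));
    }
}
```

In the original benchmark length always equals ar.length, but the compiler cannot assume that relation from the code alone, so the failure path has to stay in the generated code.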


Manual copy appeared faster for small arrays, because unlike System.arraycopy it does not perform any runtime calls to VM functions.
