为什么边界检查没有被消除? [英] Why the bounds check doesn't get eliminated?

查看:31
本文介绍了为什么边界检查没有被消除?的处理方法,对大家解决问题具有一定的参考价值,需要的朋友们下面随着小编来一起学习吧!

问题描述

我写了一个简单的基准找出当数组通过按位和计算时是否可以消除边界检查.这基本上是几乎所有哈希表所做的:他们计算

I wrote a simple benchmark in order to find out if bounds check can be eliminated when the array gets computed via bitwise and. This is basically what nearly all hash tables do: They compute

h & (table.length - 1)

作为 table 的索引,其中 hhashCode 或派生值.结果 表明边界检查没有被消除.

as an index into the table, where h is the hashCode or a derived value. The results shows that the bounds check don't get eliminated.

我的基准测试的想法非常简单:计算两个值 ij,其中两个值都保证是有效的数组索引.

The idea of my benchmark is pretty simple: Compute two values i and j, where both are guaranteed to be valid array indexes.

  • i 是循环计数器.当它被用作数组索引时,边界检查就被消除了.
  • j 被计算为 x &(table.length - 1),其中 x 是每次迭代都会改变的某个值.当它被用作数组索引时,边界检查不会被消除.
  • i is the loop counter. When it gets used as array index, the bounds check gets eliminated.
  • j gets computed as x & (table.length - 1), where x is some value changing on each iteration. When it gets used as array index, the bounds check does not get eliminated.

相关部分如下:

for (int i=0; i<=table.length-1; ++i) {
    x += result;
    final int j = x & (table.length-1);
    result ^= i + table[j];
}

其他实验使用

    result ^= table[i] + j;

相反.时间上的差异可能是 15%(在我尝试过的不同变体中几乎一致).我的问题:

instead. The difference in timing is maybe 15% (pretty consistently across different variants I've tried). My questions:

  • 除了绑定检查消除之外,还有其他可能的原因吗?
  • 是否有一些复杂的原因我看不出为什么没有对 j 进行边界检查消除?
  • Are there other possible reasons for this besides bound check elimination?
  • Is there some complicated reason I can't see why there's no bound check elimination for j?

MarkoTopolnik 的回答表明,这一切都更加复杂,并不能保证消除边界检查一定会成功,尤其是在他的计算机上,正常"代码比屏蔽"慢.我想这是因为它允许一些额外的优化,这在这种情况下实际上是有害的(考虑到当前 CPU 的复杂性,编译器甚至不确定).

MarkoTopolnik's answer shows that it's all more complicated and the elimination of the bounds checks is not guaranteed to be a win, especially on his computer the "normal" code is slower than "masked". I guess this is because of it allowing some additional optimization which shows to be actually detrimental in this case (given the complexity of the current CPUs, the compiler hardly even knows for sure).

leventov 的回答清楚地表明数组边界检查是在屏蔽"中完成的.并且它的消除使代码和正常"一样快.

leventov's answer shows clearly that the array bounds check gets done in "masked" and that it's elimination makes the code as fast as "normal".

Donal Fellows 指出这样一个事实,即屏蔽不适用于零长度表,如 x &(0-1) 等于 x.所以编译器能做的最好的事情就是用零长度检查替换边界检查.但恕我直言,这仍然值得,因为零长度检查可以轻松移出循环.

Donal Fellows points to the fact, that the masking doesn't work for a zero-length table, as x & (0-1) equals to x. So the best thing the compiler can do is to replace the bound check by a zero-length check. But this is IMHO still worth it, as the zero-length check can be moved out of the loop easily.

因为等价 a[x &(a.length - 1)] 当且仅当 a.length == 0 时抛出,编译器可以执行以下操作:

Because of the the equivalence a[x & (a.length - 1)] throws if and only if a.length == 0, the compiler can do the following:

  • 对于每个数组访问,检查索引是否是通过按位与计算出来的.
  • 如果是,请检查任一操作数是否计算为长度减一.
  • 如果是,请用零长度检查替换边界检查.
  • 让现有的优化来处理它.

这样的优化应该非常简单和廉价,因为它只查看 SSA图.与许多复杂的优化不同,它永远不会有害,因为它只会用稍微简单的检查替换一项检查;所以没有问题,即使它不能移出循环.

Such an optimization should be pretty simple and cheap as it only looks at the parent nodes in the SSA graph. Unlike many complex optimizations, it can never be detrimental, as it only replaces one check by a slightly simpler one; so there's no problem, not even if it can't be moved out of the loop.

我会将其发布到 hotspot-dev 邮件列表.

I'll post this to the hotspot-dev mailing lists.

John Rose 提交了RFE 并且已经有一个快速和-脏"补丁.

John Rose filed an RFE and there's already a "quick-and-dirty" patch.

推荐答案

  1. 不,这显然是由于没有足够的智能边界检查消除造成的.

我扩展了 Marko Topolnik 的基准:

I've extended a benchmark by Marko Topolnik:

@OutputTimeUnit(TimeUnit.NANOSECONDS)
@BenchmarkMode(Mode.AverageTime)
@OperationsPerInvocation(BCElimination.N)
@Warmup(iterations = 5, time = 1)
@Measurement(iterations = 10, time = 1)
@State(Scope.Thread)
@Threads(1)
@Fork(2)
public class BCElimination {
    public static final int N = 1024;
    private static final Unsafe U;
    private static final long INT_BASE;
    private static final long INT_SCALE;
    static {
        try {
            Field f = Unsafe.class.getDeclaredField("theUnsafe");
            f.setAccessible(true);
            U = (Unsafe) f.get(null);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }

        INT_BASE = U.arrayBaseOffset(int[].class);
        INT_SCALE = U.arrayIndexScale(int[].class);
    }

    private final int[] table = new int[BCElimination.N];

    @Setup public void setUp() {
        final Random random = new Random();
        for (int i=0; i<table.length; ++i) table[i] = random.nextInt();
    }

    @GenerateMicroBenchmark public int normalIndex() {
        int result = 0;
        final int[] table = this.table;
        int x = 0;
        for (int i=0; i<=table.length-1; ++i) {
            x += i;
            final int j = x & (table.length-1);
            result ^= table[i] + j;
        }
        return result;
    }

    @GenerateMicroBenchmark public int maskedIndex() {
        int result = 0;
        final int[] table = this.table;
        int x = 0;
        for (int i=0; i<=table.length-1; ++i) {
            x += i;
            final int j = x & (table.length-1);
            result ^= i + table[j];
        }
        return result;
    }

    @GenerateMicroBenchmark public int maskedIndexUnsafe() {
        int result = 0;
        final int[] table = this.table;
        long x = 0;
        for (int i=0; i<=table.length-1; ++i) {
            x += i * INT_SCALE;
            final long j = x & ((table.length-1) * INT_SCALE);
            result ^= i + U.getInt(table, INT_BASE + j);
        }
        return result;
    }
}

结果:

Benchmark                                Mean   Mean error    Units
BCElimination.maskedIndex               1,235        0,004    ns/op
BCElimination.maskedIndexUnsafe         1,092        0,007    ns/op
BCElimination.normalIndex               1,071        0,008    ns/op


2. 第二个问题是针对 hotspot-dev 邮件列表而不是 StackOverflow,恕我直言.


2. The second question is for hotspot-dev mailing lists rather than StackOverflow, IMHO.

这篇关于为什么边界检查没有被消除?的文章就介绍到这了,希望我们推荐的答案对大家有所帮助,也希望大家多多支持IT屋!

查看全文
登录 关闭
扫码关注1秒登录
发送“验证码”获取 | 15天全站免登陆