Compiler instruction reordering optimizations in C++ (and what inhibits them)


Question

I've reduced my code down to the following, which is as simple as I could make it whilst retaining the compiler output that interests me.

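#include <stdint.h>  // uint64_t

extern uint64_t some_global_array[100];  // assumed declaration; defined elsewhere
void bar(const uint64_t* ar);            // assumed signature; defined elsewhere

void foo(const uint64_t used)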
{
    uint64_t ar[100];
    for(int i = 0; i < 100; ++i)
    {
        ar[i] = some_global_array[i];
    }

    const uint64_t mask = ar[0];
    if((used & mask) != 0)
    {
        return;
    }

    bar(ar); // Not inlined
}

Using VC10 with /O2 and /Ob1, the generated assembly pretty much reflects the order of instructions in the above C++ code. Since the local array ar is only passed to bar() when the condition fails, and is otherwise unused, I would have expected the compiler to optimize to something like the following.

if((used & some_global_array[0]) != 0)
{
    return;
}

// Now do the copying to ar and call bar(ar)...

Is the compiler not doing this because it's simply too hard for it to identify such optimizations in the general case? Or is it following some strict rule that forbids it from doing so? If so, why, and is there some way I can give it a hint that doing so wouldn't change the semantics of my program?

Note: obviously it would be trivial to obtain the optimized output by just rearranging the code, but I'm interested in why the compiler won't optimize in such cases, not how to do so in this (intentionally simplified) case.

Recommended answer

There are no "strict rules" controlling what kind of assembly language the compiler is permitted to output, beyond the requirement that the program's observable behaviour is preserved (the "as-if" rule). If the compiler can be certain that a block of code does not need to be executed (because it has no side effects) due to some precondition, then it is absolutely permitted to short-circuit the whole thing.
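As a rough illustration of what "no side effects" means here, the sketch below contrasts a copy from an ordinary array with a copy from a volatile one. The names plain_source, device_source and both functions are hypothetical and not part of the question. Reads of ordinary objects are not observable behaviour, so a compiler is allowed (though never obliged) to reduce the first function to a single load of element 0. Reads through a volatile lvalue count as observable behaviour, so all 100 loads in the second function have to stay.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical data sources, named for illustration only.
std::uint64_t plain_source[100];
volatile std::uint64_t device_source[100];

// Only ar[0] influences the result, and reading plain_source has no side
// effects, so the optimizer may skip the other 99 copies entirely.
bool passes_mask_plain(const std::uint64_t used)
{
    std::uint64_t ar[100];
    for (int i = 0; i < 100; ++i)
    {
        ar[i] = plain_source[i];
    }
    return (used & ar[0]) == 0;
}

// Same logic, but every read of device_source is a volatile access and
// therefore must be performed, in order, even though 99 results are unused.
bool passes_mask_volatile(const std::uint64_t used)
{
    std::uint64_t ar[100];
    for (int i = 0; i < 100; ++i)
    {
        ar[i] = device_source[i];
    }
    return (used & ar[0]) == 0;
}
```

Both functions return the same values; the difference only shows up in the generated assembly, and whether a particular compiler actually removes the dead copies in the first one is something to verify by inspecting its output.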

This sort of optimisation can be fairly complex in the general case, and your compiler might not go to all that effort. If this is performance-critical code, then you can fine-tune your source code (as you suggest) to help the compiler generate the best assembly code. This is a trial-and-error process though, and you might have to do it again for the next version of the compiler.
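The manual rearrangement mentioned in the question could look roughly like the sketch below. It places the original ordering next to a hand-reordered version and checks that both make the same calls to bar(). The definitions of some_global_array and bar(), the two counters, and check_equivalence() are stand-ins invented for this example, since the question does not show them.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for the names the question leaves undefined.
std::uint64_t some_global_array[100];
int bar_calls = 0;               // how many times bar() ran
std::uint64_t bar_checksum = 0;  // sum of everything bar() was handed

void bar(const std::uint64_t* ar)
{
    ++bar_calls;
    for (int i = 0; i < 100; ++i)
    {
        bar_checksum += ar[i];
    }
}

// Ordering from the question: copy all 100 elements, then test the mask.
void foo_original(const std::uint64_t used)
{
    std::uint64_t ar[100];
    for (int i = 0; i < 100; ++i)
    {
        ar[i] = some_global_array[i];
    }
    const std::uint64_t mask = ar[0];
    if ((used & mask) != 0)
    {
        return;
    }
    bar(ar);
}

// Hand-reordered: test against the global first, copy only when needed.
void foo_early_exit(const std::uint64_t used)
{
    if ((used & some_global_array[0]) != 0)
    {
        return;
    }
    std::uint64_t ar[100];
    for (int i = 0; i < 100; ++i)
    {
        ar[i] = some_global_array[i];
    }
    bar(ar);
}

// Runs both versions on a few inputs and asserts that bar() is called
// equally often and with the same contents.
void check_equivalence()
{
    for (int i = 0; i < 100; ++i)
    {
        some_global_array[i] = static_cast<std::uint64_t>(i) + 1;
    }
    const std::uint64_t inputs[] = {0, 1, 2, 3};
    for (int k = 0; k < 4; ++k)
    {
        bar_calls = 0;
        bar_checksum = 0;
        foo_original(inputs[k]);
        const int calls_original = bar_calls;
        const std::uint64_t sum_original = bar_checksum;

        bar_calls = 0;
        bar_checksum = 0;
        foo_early_exit(inputs[k]);
        assert(calls_original == bar_calls);
        assert(sum_original == bar_checksum);
    }
}
```

The two versions are interchangeable only because nothing else modifies some_global_array between the copy and the test. Whether the reordered form actually produces better assembly on a given compiler version still has to be confirmed by looking at the output.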
