When does trivial code (code that has no effect) get removed during compilation?

Question

volatile int num = 0;
num = num + 10;

The C++ code above seems to produce the following Intel-syntax assembly:

mov DWORD PTR [rbp-4], 0
mov eax, DWORD PTR [rbp-4]
add eax, 10
mov DWORD PTR [rbp-4], eax

If I change the C++ code to

volatile int num = 0;
num = num + 0;

why doesn't the compiler produce this assembly?

mov DWORD PTR [rbp-4], 0
mov eax, DWORD PTR [rbp-4]
add eax, 0
mov DWORD PTR [rbp-4], eax

gcc7.2 -O0 leaves out the add eax, 0, but all the other instructions are the same (Godbolt).

At which stage of the compilation process does this kind of trivial code get removed? Is there a compiler flag that makes GCC not perform this kind of optimization?

Answer

clang will emit add eax, 0 at -O0, but gcc, ICC, and MSVC will not. See below.


gcc -O0 doesn't mean "no optimization". gcc doesn't have a "braindead literal translation" mode where it tries to transliterate every component of every C expression directly to an asm instruction.

GCC's -O0 is not intended to be totally un-optimized. It's intended to be "compile-fast" and make debugging give the expected results (even if you modify C variables with a debugger, or jump to a different line within the function). So it spills / reloads everything around every C statement, assuming that memory can be asynchronously modified by a debugger stopped before such a block. (Interesting example of the consequences, and a more detailed explanation: Why does integer division by -1 (negative one) result in FPE?)


There isn't much demand for gcc -O0 to make even slower code (e.g. forgetting that 0 is the additive identity), so nobody has implemented an option for that. And it might even make gcc itself slower if that behaviour were optional. (Or maybe there is such an option but it's on by default even at -O0, because it's fast, doesn't hurt debugging, and is useful. Usually people like it when their debug builds run fast enough to be usable, especially for big or real-time projects.)

As @Basile Starynkevitch explains in Disable all optimization options in GCC, gcc always transforms through its internal representations on the way to making an executable. Just doing this at all results in some kinds of optimizations.

For example, even at -O0, gcc's "divide by a constant" algorithm uses a fixed-point multiplicative inverse or a shift (for powers of 2) instead of an idiv instruction. But clang -O0 will use idiv for x /= 2.
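To make "fixed-point multiplicative inverse" concrete, here is a minimal C sketch for unsigned 32-bit division by 10. It is an illustration of the idea, not a transcript of what any particular gcc version emits; the names div10_by_multiply and check_div10 and the sample values are made up for this example. The constant 0xCCCCCCCD is ceil(2^35 / 10), so multiplying by it in 64 bits and shifting right by 35 yields floor(x / 10) for every 32-bit unsigned x.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned division by 10 without a div instruction.
   0xCCCCCCCD == ceil(2^35 / 10); the 64-bit product shifted right by 35
   equals x / 10 for every 32-bit unsigned x. */
static uint32_t div10_by_multiply(uint32_t x)
{
    return (uint32_t)(((uint64_t)x * 0xCCCCCCCDu) >> 35);
}

/* Hypothetical self-check: compare against the / operator on a few inputs. */
static void check_div10(void)
{
    const uint32_t samples[] = {0u, 9u, 10u, 99u, 100u, 123456789u, UINT32_MAX};
    for (size_t i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(div10_by_multiply(samples[i]) == samples[i] / 10u);
}
```

Signed division needs an extra sign correction on top of this, which is omitted here to keep the sketch short.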


Clang's -O0 optimizes less than gcc's in this case, too:

void foo(void) {
    volatile int num = 0;
    num = num + 0;
}

asm output on Godbolt for x86-64

    push    rbp
    mov     rbp, rsp

    # your asm block from the question, but with 0 instead of 10
    mov     dword ptr [rbp - 4], 0
    mov     eax, dword ptr [rbp - 4]
    add     eax, 0
    mov     dword ptr [rbp - 4], eax

    pop     rbp
    ret

As you say, gcc leaves out the useless add eax,0. ICC17 stores/reloads multiple times. MSVC is usually extremely literal in debug mode, but even it avoids emitting add eax,0.
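What the compilers may and may not drop here can be spelled out in C. Because num is volatile, the read and the write are observable accesses that have to stay; only the arithmetic in between is free to be simplified. The sketch below is an illustration of that split, with hypothetical function names, and is not taken from any compiler's documentation.

```c
#include <assert.h>

/* Same observable effect as `num = num + 0;` on a volatile int:
   exactly one volatile read followed by one volatile write. */
static int add_zero_volatile(void)
{
    volatile int num = 0;
    int tmp = num;   /* volatile read: must be performed */
    num = tmp;       /* volatile write: must be performed; adding 0 changes nothing */
    return num;      /* one more volatile read, just to return the value */
}

/* Hypothetical self-check. */
static void check_add_zero(void)
{
    assert(add_zero_volatile() == 0);
}
```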

Clang is also the only one of the 4 x86 compilers on Godbolt that will use idiv for return x/2;. The others all use SAR + CMOV or some similar sequence to implement C's signed division semantics.
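A plain arithmetic right shift rounds toward negative infinity, while C's signed division rounds toward zero, so a shift-based x / 2 needs a fix-up for negative inputs. The sketch below shows one such fix-up in C: it adds the sign bit before shifting, which is a different variant from the CMOV form mentioned above. It assumes a 32-bit int and that >> on a negative int is an arithmetic shift (implementation-defined in C, but true for the x86 compilers discussed here). The function names are made up for illustration.

```c
#include <assert.h>
#include <limits.h>

/* x / 2 with C's round-toward-zero semantics, using a shift.
   Assumes 32-bit int and arithmetic right shift of negative values. */
static int div2_by_shift(int x)
{
    int bias = (int)((unsigned)x >> 31);  /* 1 if x is negative, else 0 */
    return (x + bias) >> 1;               /* the bias makes the shift round toward zero */
}

/* Hypothetical self-check: compare against the / operator. */
static void check_div2(void)
{
    const int samples[] = {0, 1, -1, 7, -7, 8, -8, INT_MAX, INT_MIN};
    for (unsigned i = 0; i < sizeof samples / sizeof samples[0]; ++i)
        assert(div2_by_shift(samples[i]) == samples[i] / 2);
}
```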
