Why are compilers so stupid?


Question



I always wonder why compilers can't figure out simple things that are obvious to the human eye. They do lots of simple optimizations, but never something even a little bit complex. For example, this code takes about 6 seconds on my computer to print the value zero (using java 1.6):

int x = 0;
for (int i = 0; i < 100 * 1000 * 1000 * 1000; ++i) {
    x += x + x + x + x + x;
}
System.out.println(x);

It is totally obvious that x is never changed so no matter how often you add 0 to itself it stays zero. So the compiler could in theory replace this with System.out.println(0).
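(As an aside, javac does constant-fold the loop bound at compile time, and because the literal arithmetic is done in 32-bit `int`, `100 * 1000 * 1000 * 1000` wraps around. A minimal sketch showing the folded value — the class name here is just for illustration:)

```java
public class FoldedBound {
    public static void main(String[] args) {
        // javac evaluates this constant expression at compile time; the
        // multiplication wraps in 32-bit int arithmetic, so the loop above
        // actually runs about 1.2 billion times, not 100 billion.
        System.out.println(100 * 1000 * 1000 * 1000);  // prints 1215752192
        // With a long operand the intended value survives:
        System.out.println(100L * 1000 * 1000 * 1000); // prints 100000000000
    }
}
```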

Or even better, this takes 23 seconds:

public int slow() {
    String s = "x";
    for (int i = 0; i < 100000; ++i) {
        s += "x";
    }
    return 10;
}

First, the compiler could notice that I am actually building a string s of 100,000 "x" characters, so it could automatically use a StringBuilder for s, or even better replace it directly with the resulting string, since it is always the same. Second, it does not recognize that I never use the string at all, so the whole loop could be discarded!
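(The StringBuilder rewrite the question has in mind, done by hand, would look roughly like this — the class and method names are illustrative, not from the original:)

```java
public class SlowFixed {
    // Hand-optimized equivalent of slow(): one pre-sized StringBuilder
    // instead of 100,000 intermediate String copies.
    static String build() {
        StringBuilder sb = new StringBuilder(100001);
        sb.append("x");
        for (int i = 0; i < 100000; ++i) {
            sb.append("x");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // "x" plus 100,000 appended "x" characters:
        System.out.println(build().length()); // prints 100001
    }
}
```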

Why, after so much manpower is going into fast compilers, are they still so relatively dumb?

EDIT: Of course these are stupid examples that should never be used anywhere. But whenever I have to rewrite beautiful, very readable code into something unreadable so that the compiler is happy and produces fast code, I wonder why compilers or some other automated tool can't do this work for me.

Solution

Oh, I don't know. Sometimes compilers are pretty smart. Consider the following C program:

#include <stdio.h>  /* printf() */

int factorial(int n) {
   return n == 0 ? 1 : n * factorial(n - 1);
}

int main() {
   int n = 10;

   printf("factorial(%d) = %d\n", n, factorial(n));

   return 0;
}

On my version of GCC (4.3.2 on Debian testing), when compiled with no optimizations, or -O1, it generates code for factorial() like you'd expect, using a recursive call to compute the value. But on -O2, it does something interesting: it compiles down to a tight loop:

    factorial:
   .LFB13:
           testl   %edi, %edi
           movl    $1, %eax
           je  .L3
           .p2align 4,,10
           .p2align 3
   .L4:
           imull   %edi, %eax
           subl    $1, %edi
           jne .L4
   .L3:
           rep
           ret

Pretty impressive. The recursive call (not even tail-recursive) has been completely eliminated, so factorial now uses O(1) stack space instead of O(N). And although I have only very superficial knowledge of x86 assembly (actually AMD64 in this case, but I don't think any of the AMD64 extensions are being used above), I doubt that you could write a better version by hand. But what really blew my mind was the code that it generated on -O3. The implementation of factorial stayed the same. But main() changed:
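(In source form, the transformation gcc performed amounts to rewriting the recursion as an accumulator loop: multiply down from n to 1, decrementing the counter, just as the `imull`/`subl`/`jne` sequence above does. A sketch of the same transformation in Java, since the question used Java:)

```java
public class Factorial {
    // Recursive definition, as in the C source.
    static int factorialRecursive(int n) {
        return n == 0 ? 1 : n * factorialRecursive(n - 1);
    }

    // Iterative form corresponding to the loop gcc generated at -O2:
    // accumulate the product while counting n down, using O(1) stack space.
    static int factorialIterative(int n) {
        int acc = 1;
        while (n != 0) {
            acc *= n;
            n -= 1;
        }
        return acc;
    }

    public static void main(String[] args) {
        System.out.println(factorialIterative(10)); // prints 3628800
    }
}
```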

    main:
   .LFB14:
           subq    $8, %rsp
   .LCFI0:
           movl    $3628800, %edx
           movl    $10, %esi
           movl    $.LC0, %edi
           xorl    %eax, %eax
           call    printf
           xorl    %eax, %eax
           addq    $8, %rsp
           ret

See the movl $3628800, %edx line? gcc is pre-computing factorial(10) at compile-time. It doesn't even call factorial(). Incredible. My hat is off to the GCC development team.

Of course, all the usual disclaimers apply, this is just a toy example, premature optimization is the root of all evil, etc., etc., but it illustrates that compilers are often smarter than you think. If you think you can do a better job by hand, you're almost certainly wrong.

(Adapted from a posting on my blog.)
