Optimizers are overrated


Question


I started learning ASM not long ago to improve my understanding of the
hardware architecture and my ability to optimize C code. The results of my
first experiment were surprising, to say the least. After reading the chapter
on loops in my ASM book, I wanted to test whether modern C compilers are
actually as smart as commonly claimed. I chose a very simple loop: calling
putchar 100 times. The first function (foo) uses a typical C-style loop to
test the assumption that "the compiler will optimize that better than any
human could". The second function (bar) is based on my newly gained
knowledge; the loop is basically ASM written in C. I am certain that if I
asked here which one is more efficient, all you guys would reply "the
compiler will most likely generate the same code in both cases" (I have read
such claims countless times here). Well, look at the ASM output below to see
how wrong your assumption is.

/* C Code */

#include <stdio.h>

void foo(void)
{
    int i;

    for (i = 0; i < 100; i++) putchar('a');
}

void bar(void)
{
    int i = 100;

    do {
        putchar('a');
    } while (--i);
}

As I said, damn simple. No nasty side effects, no access to global
variables, etc. The optimizer has no excuses. It should generate optimal
code in both cases. But see the result:

/* GCC 4.3.0 on x86/Windows, -O2 */

/* foo */
L7:
    subl $12, %esp
    pushl $97
    call _putchar

    incl %ebx
    addl $16, %esp
    cmpl $100, %ebx
    jne L7

/* bar */
L2:
    subl $12, %esp
    pushl $97
    call _putchar
    addl $16, %esp

    decl %ebx
    jne L2

Comment: See, even the most recent version of what is probably the most
widely used compiler cannot correctly optimize the simplest of loops! At
least GCC understood the bar loop, so my "write C like ASM" optimization
worked.
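
The foo and bar loops are interchangeable only because the trip count is a
known, nonzero constant: a do/while body runs once before its first test. A
minimal sketch of that equivalence follows. None of the names in it are from
the original post: count_char, up_loop and down_loop are hypothetical, and
count_char merely counts calls instead of printing.

```c
#include <assert.h>

/* Hypothetical stand-in for putchar: counts calls instead of printing. */
static int calls;

int count_char(int c)
{
    ++calls;
    return c;
}

/* foo-style loop: counts up and compares against n on every iteration. */
int up_loop(int n)
{
    int i;

    calls = 0;
    for (i = 0; i < n; i++) count_char('a');
    return calls;
}

/* bar-style loop: counts down to zero. Only valid for n >= 1, because the
   body executes once before the condition is first tested. */
int down_loop(int n)
{
    int i = n;

    assert(n >= 1);
    calls = 0;
    do {
        count_char('a');
    } while (--i);
    return calls;
}
```

For any n of at least 1 the two functions make the same number of calls; for
n equal to 0 only the counting-up form behaves correctly, which is why the
countdown form is guarded by an assertion here.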

At this point you might wonder what horrible things an average C compiler
will do when GCC already fails so badly. Here is the gruesome result:

/* lccwin32, optimize on */

/* foo */
_$4:
    pushl $97
    call _putchar
    popl %ecx

    incl %edi
    cmpl $100,%edi
    jl _$4

/* bar */
_$10:
    pushl $97
    call _putchar
    popl %ecx

    movl %edi,%eax
    decl %eax
    movl %eax,%edi
    or %eax,%eax
    jne _$10

Comment: lcc is unable to optimize the loop, just like GCC, but it adds
insult to injury by actually generating worse code for the ASM-style loop!
So you cannot even optimize the loop yourself!

/* MS Visual C++ 6 /O2 */

For this compiler I had to replace the putchar call with a call to a custom
my_putchar function; otherwise the compiler replaces the putchar calls with
direct OS API stuff. While this is a good optimization, it is not the
subject of this test and only makes the resulting asm harder to read, so I
suppressed that.
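
The post does not show my_putchar itself, so the following is only a sketch
of one plausible shape for it, together with the two loops rewritten to call
it. The body of my_putchar (a thin wrapper around fputc) is an assumption,
not the author's code.

```c
#include <stdio.h>

/* Hypothetical definition: the original my_putchar was not posted.
   A plain function call keeps the loop body down to a single call. */
int my_putchar(int c)
{
    return fputc(c, stdout);
}

/* Same loop as foo in the post, calling my_putchar instead of putchar. */
void foo(void)
{
    int i;

    for (i = 0; i < 100; i++) my_putchar('a');
}

/* Same loop as bar in the post, calling my_putchar instead of putchar. */
void bar(void)
{
    int i = 100;

    do {
        my_putchar('a');
    } while (--i);
}
```

If my_putchar lives in the same source file as the loops, a compiler may
inline it; putting it in a separate source file is one way to keep the call
visible in the generated assembly.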

/* foo */
    jmp SHORT $L833
$L834:
    mov eax, DWORD PTR _i$[ebp]
    add eax, 1
    mov DWORD PTR _i$[ebp], eax
$L833:
    cmp DWORD PTR _i$[ebp], 100
    jge SHORT $L835

    push 97
    call _my_putchar
    add esp, 4

    jmp SHORT $L834
$L835:

/* bar */
$L840:
    push 97
    call _my_putchar
    add esp, 4

    mov eax, DWORD PTR _i$[ebp]
    sub eax, 1
    mov DWORD PTR _i$[ebp], eax
    cmp DWORD PTR _i$[ebp], 0
    jne SHORT $L840

Comment: Amazingly enough, this compiler has found yet another way to screw
up. Would you have thought that each compiler generates different code for
such a simple construct?

I hope you agree that the compiler of the beast deserves the "Worst of Show"
award for this mess. Are MS compilers still this bad?


