Why are ternary and logical operators more efficient than if branches?


Question

I stumbled upon this question/answer which mentions that in most languages, logical operators such as:

x == y && doSomething();

can be faster than doing the same thing with an if branch:

if(x == y) {
  doSomething();
}

Similarly, it says that the ternary operator:

x = y == z ? 0 : 1;

is usually faster than using an if branch:

if(y == z) {
  x = 0;
} else {
  x = 1;
}

This got me Googling, which led me to this fantastic answer which explains branch prediction.

Basically, what it says is that the CPU operates at very fast speeds, and rather than slowing down to compute every if branch, it tries to guess what outcome will take place and places the appropriate instructions in its pipeline. But if it makes the wrong guess, it will have to back up and recompute the appropriate instructions.
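To make the cost of a wrong guess concrete, here is a minimal sketch of the kind of experiment usually used to observe it. The function names (count_at_least, sort_ints, time_passes) are made up for this illustration. The idea is that the same if branch is easy to predict on sorted data and hard to predict on random data. An optimizing compiler may remove the branch entirely (for example by vectorizing the loop), in which case no difference will show up, so treat any timings as indicative only.

```c
#include <stddef.h>
#include <stdlib.h>
#include <time.h>

/* Counts the elements that are >= threshold using an ordinary if branch. */
size_t count_at_least(const int *data, size_t n, int threshold)
{
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (data[i] >= threshold) {   /* outcome depends on the data */
            count++;
        }
    }
    return count;
}

/* Comparison callback for qsort: ascending order of int. */
static int compare_ints(const void *a, const void *b)
{
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y);
}

/* Sorts in place, so the branch above sees one long run of "false"
   followed by one long run of "true". */
void sort_ints(int *data, size_t n)
{
    qsort(data, n, sizeof data[0], compare_ints);
}

/* Returns the CPU seconds spent on `repeats` passes over the data.
   The volatile sink makes sure the result is used; the compiler is
   still free to optimize the passes themselves. */
double time_passes(const int *data, size_t n, int threshold, int repeats)
{
    volatile size_t sink = 0;
    clock_t start = clock();
    for (int r = 0; r < repeats; r++) {
        sink += count_at_least(data, n, threshold);
    }
    return (double)(clock() - start) / CLOCKS_PER_SEC;
}
```

One way to use it would be to fill a large array with random values, call time_passes on it, then call sort_ints and time it again. The count is the same in both runs; only the predictability of the branch changes.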

But this still doesn't explain to me why logical operators or the ternary operator are treated differently than if branches. Since the CPU doesn't know the outcome of x == y, shouldn't it still have to guess whether to place the call to doSomething() (and therefore, all of doSomething's code) into its pipeline? And, therefore, back up if its guess was incorrect? Similarly, for the ternary operator, shouldn't the CPU have to guess whether y == z will evaluate to true when determining what to store in x, and back up if its guess was wrong?

I don't understand why if branches are treated any differently by the compiler than any other statement which is conditional. Shouldn't all conditionals be evaluated the same way?

Solution

Short answer: it simply isn't. While helping branch prediction can improve your performance, writing the condition as part of a logical expression instead of an if statement doesn't change the compiled code. If you want to help branch prediction, use __builtin_expect (a GNU extension).
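As a rough sketch of what such a hint looks like in practice: the LIKELY/UNLIKELY macro names and the sum_valid function below are invented for illustration, and the fallback for other compilers is an assumption. __builtin_expect only tells the compiler which outcome you expect, so that it can arrange the generated code accordingly; it never changes the result of the expression.

```c
#include <stddef.h>

/* Hypothetical helper macros; the names are a common convention,
   not something taken from the answer itself. */
#if defined(__GNUC__) || defined(__clang__)
#define LIKELY(x)   __builtin_expect(!!(x), 1)
#define UNLIKELY(x) __builtin_expect(!!(x), 0)
#else
#define LIKELY(x)   (x)   /* no hint available: plain condition */
#define UNLIKELY(x) (x)
#endif

/* Sums the non-negative entries. Negative entries are assumed to be
   rare error markers and are counted in *errors instead. */
long sum_valid(const int *values, size_t count, size_t *errors)
{
    long total = 0;
    *errors = 0;
    for (size_t i = 0; i < count; i++) {
        if (UNLIKELY(values[i] < 0)) {
            (*errors)++;          /* expected to be the cold path */
            continue;
        }
        total += values[i];       /* expected to be the hot path */
    }
    return total;
}
```

Whether the hint helps at all depends on the compiler, the target CPU, and how skewed the condition really is, so it is worth measuring before keeping it.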

To emphasize the point, let's compare the compiler output:

#include <stdio.h>


int main(){
        int foo;

        scanf("%d", &foo); /* read at run time so the compiler cannot optimize the test away */

#ifdef IF       
        if (foo)
                printf("Foo!");
#else
        foo &&  printf("Foo!");
#endif 
        return 0;
}

For gcc -O3 branch.c -DIF we get:

0000000000400540 <main>:
  400540:       48 83 ec 18             sub    $0x18,%rsp
  400544:       31 c0                   xor    %eax,%eax
  400546:       bf 68 06 40 00          mov    $0x400668,%edi
  40054b:       48 8d 74 24 0c          lea    0xc(%rsp),%rsi
  400550:       e8 e3 fe ff ff          callq  400438 <__isoc99_scanf@plt>
  400555:       8b 44 24 0c             mov    0xc(%rsp),%eax
  400559:       85 c0                   test   %eax,%eax #This is the relevant part
  40055b:       74 0c                   je     400569 <main+0x29>
  40055d:       bf 6b 06 40 00          mov    $0x40066b,%edi
  400562:       31 c0                   xor    %eax,%eax
  400564:       e8 af fe ff ff          callq  400418 <printf@plt>
  400569:       31 c0                   xor    %eax,%eax
  40056b:       48 83 c4 18             add    $0x18,%rsp
  40056f:       c3                      retq 

And for gcc -O3 branch.c we get:

0000000000400540 <main>:
  400540:       48 83 ec 18             sub    $0x18,%rsp
  400544:       31 c0                   xor    %eax,%eax
  400546:       bf 68 06 40 00          mov    $0x400668,%edi
  40054b:       48 8d 74 24 0c          lea    0xc(%rsp),%rsi
  400550:       e8 e3 fe ff ff          callq  400438 <__isoc99_scanf@plt>
  400555:       8b 44 24 0c             mov    0xc(%rsp),%eax
  400559:       85 c0                   test   %eax,%eax
  40055b:       74 0c                   je     400569 <main+0x29>
  40055d:       bf 6b 06 40 00          mov    $0x40066b,%edi
  400562:       31 c0                   xor    %eax,%eax
  400564:       e8 af fe ff ff          callq  400418 <printf@plt>
  400569:       31 c0                   xor    %eax,%eax
  40056b:       48 83 c4 18             add    $0x18,%rsp
  40056f:       c3                      retq 

This is exactly the same code.
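The same comparison can be repeated for the ternary operator from the question. The sketch below is not part of the measurement above; the function names pick_ternary and pick_if and the file name ternary.c are made up for illustration. Both functions compute the same value, so an optimizing compiler would normally be expected to treat them alike, but the only way to be sure for a given compiler and target is to look at the output.

```c
/* Ternary form from the question, wrapped in a function so that the
   compiler cannot see the inputs at compile time. */
int pick_ternary(int y, int z)
{
    return y == z ? 0 : 1;
}

/* Equivalent if/else form from the question. */
int pick_if(int y, int z)
{
    int x;
    if (y == z) {
        x = 0;
    } else {
        x = 1;
    }
    return x;
}
```

Assuming the code is saved as ternary.c, running gcc -O3 -S ternary.c writes the generated assembly to ternary.s, where the two function bodies can be compared side by side.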

The question you linked to measures performance in JavaScript. Note that there the two forms may be translated into something different, since JavaScript is either interpreted or JIT-compiled depending on the engine and its version. In any case, JavaScript is not the best language for learning about low-level performance.
