Can I use GCC's __builtin_expect() with the ternary operator in C?

Question

The GCC manual only shows examples where __builtin_expect() is placed around the entire condition of an 'if' statement.

I also noticed that GCC does not complain if I use it, for example, with a ternary operator, or in any arbitrary integral expression for that matter, even one that is not used in a branching context.

So, I wonder what the underlying constraints of its usage actually are.

Will it retain its effect when used in a ternary operation like this:

int foo(int i)
{
  return __builtin_expect(i == 7, 1) ? 100 : 200;
}

And what about this case:

int foo(int i)
{
  return __builtin_expect(i, 7) == 7 ? 100 : 200;
}

And this one:

int foo(int i)
{
  int j = __builtin_expect(i, 7);
  return j == 7 ? 100 : 200;
}
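
For reference, the GCC manual describes `__builtin_expect(exp, c)` as returning the value of `exp`, with `c` being the value that `exp` is expected to have. Read that way, all three forms above are well-defined and compute the same result. Whether the hint still reaches the branch in the second and third forms depends on the optimizer, and the sketch below does not try to show that. The function names `foo_cond`, `foo_value` and `foo_var` are hypothetical, introduced only to keep the three variants side by side.

```c
/* Three placements of __builtin_expect from the question.
 * Function names are hypothetical; only the placement differs. */

/* Hint wraps the whole condition: "i == 7 is expected to be true". */
int foo_cond(int i)
{
    return __builtin_expect(i == 7, 1) ? 100 : 200;
}

/* Hint wraps the operand: "i is expected to equal 7". */
int foo_value(int i)
{
    return __builtin_expect(i, 7) == 7 ? 100 : 200;
}

/* Hint applied before the branch, result stored in a variable. */
int foo_var(int i)
{
    int j = __builtin_expect(i, 7);
    return j == 7 ? 100 : 200;
}
```

Calling each function with 7 returns 100, and any other argument returns 200, so the placement changes at most the generated code, never the result.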

Answer

It apparently works for both ternary and regular if statements.

First, let's take a look at the following three code samples. Two of them use __builtin_expect, one in a regular if statement and one in a ternary expression, and the third does not use it at all.

builtin.c:

#include <stdio.h>

int main(void)
{
    char c = getchar();
    const char *printVal;
    if (__builtin_expect(c == 'c', 1))
    {
        printVal = "Took expected branch!\n";
    }
    else
    {
        printVal = "Boo!\n";
    }

    printf(printVal);
}

ternary.c:

#include <stdio.h>

int main(void)
{
    char c = getchar();
    const char *printVal = __builtin_expect(c == 'c', 1) 
        ? "Took expected branch!\n"
        : "Boo!\n";

    printf(printVal);
}

nobuiltin.c:

#include <stdio.h>

int main(void)
{
    char c = getchar();
    const char *printVal;
    if (c == 'c')
    {
        printVal = "Took expected branch!\n";
    }
    else
    {
        printVal = "Boo!\n";
    }

    printf(printVal);
}

When compiled with -O3, all three produce the same assembly. However, when no -O flag is given (on GCC 4.7.2), ternary.c and builtin.c produce the same assembly listing where it matters:

builtin.s:

    .file   "builtin.c"
    .section    .rodata
.LC0:
    .string "Took expected branch!\n"
.LC1:
    .string "Boo!\n"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    andl    $-16, %esp
    subl    $32, %esp
    call    getchar
    movb    %al, 27(%esp)
    cmpb    $99, 27(%esp)
    sete    %al
    movzbl  %al, %eax
    testl   %eax, %eax
    je  .L2
    movl    $.LC0, 28(%esp)
    jmp .L3
.L2:
    movl    $.LC1, 28(%esp)
.L3:
    movl    28(%esp), %eax
    movl    %eax, (%esp)
    call    printf
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Debian 4.7.2-4) 4.7.2"
    .section    .note.GNU-stack,"",@progbits

ternary.s:

    .file   "ternary.c"
    .section    .rodata
.LC0:
    .string "Took expected branch!\n"
.LC1:
    .string "Boo!\n"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    andl    $-16, %esp
    subl    $32, %esp
    call    getchar
    movb    %al, 31(%esp)
    cmpb    $99, 31(%esp)
    sete    %al
    movzbl  %al, %eax
    testl   %eax, %eax
    je  .L2
    movl    $.LC0, %eax
    jmp .L3
.L2:
    movl    $.LC1, %eax
.L3:
    movl    %eax, 24(%esp)
    movl    24(%esp), %eax
    movl    %eax, (%esp)
    call    printf
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Debian 4.7.2-4) 4.7.2"
    .section    .note.GNU-stack,"",@progbits

nobuiltin.c, by contrast, produces a different listing:

    .file   "nobuiltin.c"
    .section    .rodata
.LC0:
    .string "Took expected branch!\n"
.LC1:
    .string "Boo!\n"
    .text
    .globl  main
    .type   main, @function
main:
.LFB0:
    .cfi_startproc
    pushl   %ebp
    .cfi_def_cfa_offset 8
    .cfi_offset 5, -8
    movl    %esp, %ebp
    .cfi_def_cfa_register 5
    andl    $-16, %esp
    subl    $32, %esp
    call    getchar
    movb    %al, 27(%esp)
    cmpb    $99, 27(%esp)
    jne .L2
    movl    $.LC0, 28(%esp)
    jmp .L3
.L2:
    movl    $.LC1, 28(%esp)
.L3:
    movl    28(%esp), %eax
    movl    %eax, (%esp)
    call    printf
    leave
    .cfi_restore 5
    .cfi_def_cfa 4, 4
    ret
    .cfi_endproc
.LFE0:
    .size   main, .-main
    .ident  "GCC: (Debian 4.7.2-4) 4.7.2"
    .section    .note.GNU-stack,"",@progbits

The relevant part is the sequence between call getchar and the conditional jump.

Basically, __builtin_expect causes extra code (sete %al, movzbl %al, %eax) to be executed before the branch: the result of the comparison with 'c' is first materialized as 0 or 1, and je .L2 then branches on the outcome of testl %eax, %eax, which the CPU is presumably more likely to predict as nonzero (a naive assumption on my part), instead of branching directly on the comparison of the input char with 'c'. In the nobuiltin.c case no such code exists, and the jne directly follows the comparison with 'c' (cmpb $99).

Remember that branch prediction is mainly done in the CPU. Here GCC appears to be simply "laying a trap" for the CPU's branch predictor so that it assumes which path will be taken, via the extra code and the switch from jne to je. I do not have a source for this, though: Intel's official optimization manual does not mention treating first encounters with je and jne differently for branch prediction, so I can only assume the GCC team arrived at this by trial and error.
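
One detail that may help when reading the sete / movzbl / testl sequence: the GCC manual describes __builtin_expect as returning the value of its first argument. A comparison passed to it is therefore an ordinary integer, 0 or 1, before it is tested, which is consistent with the extra instructions in the unoptimized listings. A minimal sketch, where `expect_result` is a hypothetical helper that exists only to expose that value:

```c
/* __builtin_expect yields the value of its first argument (as a long),
 * so a comparison passed to it becomes an ordinary 0 or 1.
 * expect_result is a hypothetical name used only for illustration. */
long expect_result(char c)
{
    return __builtin_expect(c == 'c', 1);
}
```

The second argument only states the expectation; it does not change the returned value, so `expect_result('x')` is 0 even though 1 was expected.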

I am sure there are better test cases where GCC's own use of the hint can be seen more directly (instead of observing hints to the CPU), though I do not know how to construct such a case succinctly. (Guess: it would likely involve loop unrolling during compilation.)
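
A possible starting point for such a test case is a function with a clearly cold error path. The sketch below is an assumption about what might make the effect visible, not a verified result: compile it with optimization enabled (for example gcc -O2 -S), once as written and once with the `unlikely` wrapper removed, and compare where the error-handling block ends up. The exact layout depends on the compiler version and target. The `likely` and `unlikely` macro names are a common convention rather than part of GCC, and `parse_digit` is a hypothetical function.

```c
#include <stdio.h>

/* Conventional wrappers; the names are a convention, not a GCC feature.
 * The !! normalizes any nonzero value to 1 before it is compared with the
 * expected value. */
#define likely(x)   __builtin_expect(!!(x), 1)
#define unlikely(x) __builtin_expect(!!(x), 0)

/* Hypothetical helper: converts an ASCII digit to its numeric value.
 * The error path is marked unlikely, so an optimizing build is free to
 * move it away from the fall-through path. */
int parse_digit(int ch)
{
    if (unlikely(ch < '0' || ch > '9')) {
        fprintf(stderr, "not a digit: %d\n", ch);  /* cold path */
        return -1;
    }
    return ch - '0';                               /* hot path */
}
```

The call to fprintf keeps the cold path from being folded into a branchless sequence, which makes any reordering of the two blocks easier to spot in the listing.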
