GCC 4.4: Avoid range check on switch/case statement in gcc?

Problem description

This is only an issue on GCC versions prior to 4.4; it was fixed in GCC 4.5.

Is it possible to tell the compiler that the variable used in a switch fits within the provided case statements? In particular, when the range is small and a jump table is generated.

extern int a;
void f0(void), f1(void), f2(void), f3(void),
     f4(void), f5(void), f6(void), f7(void);

int main(void)
{
        switch (a & 0x7) {   // 0x7 == 0b111, so values are 0-7
        case 0: f0(); break;
        case 1: f1(); break;
        case 2: f2(); break;
        case 3: f3(); break;
        case 4: f4(); break;
        case 5: f5(); break;
        case 6: f6(); break;
        case 7: f7(); break;
        }
        return 0;
}

I tried masking off the low bits (as in the example), using enums, and using gcc_unreachable(), all to no avail. The generated code always checks whether the variable is inside the range, adding a pointless conditional branch and moving the jump-table calculation code away.

Note: this is in the innermost loop of a decoder, so performance matters significantly.

It seems I'm not the only one.

There is no way to tell gcc that the default branch is never taken, although it will omit the default branch if it can prove that the value is never out of range based on earlier conditional checks.

So, how do you help gcc prove the variable fits and there's no default branch in the example above? (Without adding a conditional branch, of course.)
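
For readers on newer compilers, one commonly used hint is a default branch that calls __builtin_unreachable(), which GCC provides from version 4.5 onward (so it does not help with the 4.2 build discussed here). The following is a minimal sketch under that assumption; the names dispatch and hits are made up for illustration and stand in for f0()..f7(). Whether the bounds check actually disappears depends on the compiler version and flags, so the generated assembly still needs to be inspected.

```c
/* Sketch only: requires GCC >= 4.5 or Clang for __builtin_unreachable(). */

unsigned hits[8];                 /* hypothetical stand-in for f0()..f7() */

void dispatch(int a)
{
    switch (a & 0x7) {            /* the mask guarantees a value in 0..7 */
    case 0: hits[0]++; break;
    case 1: hits[1]++; break;
    case 2: hits[2]++; break;
    case 3: hits[3]++; break;
    case 4: hits[4]++; break;
    case 5: hits[5]++; break;
    case 6: hits[6]++; break;
    case 7: hits[7]++; break;
    default:
        /* Promise to the compiler: control never reaches this point. */
        __builtin_unreachable();
    }
}
```

Reaching the default branch at run time would be undefined behavior, so this hint is only safe when the mask (or an earlier check) really does rule out every other value.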

Updates

  1. This was on OS X 10.6 Snow Leopard with GCC 4.2 (the Xcode default). It didn't happen with GCC 4.4/4.3 on Linux (reported by Nathon and Jens Gustedt).

  2. The functions in the example are there for readability; think of them as inlined or as plain statements. Making a function call on x86 is expensive.

    Also, as mentioned in the note, the example belongs inside a loop over data (big data).

    The generated code with gcc 4.2/OS X is:

    [...]
    andl    $7, %eax
    cmpl    $7, %eax
    ja  L11
    mov %eax, %eax
    leaq    L20(%rip), %rdx
    movslq  (%rdx,%rax,4),%rax
    addq    %rdx, %rax
    jmp *%rax
    .align 2,0x90
    L20:
    .long   L12-L20
    .long   L13-L20
    .long   L14-L20
    .long   L15-L20
    .long   L16-L20
    .long   L17-L20
    .long   L18-L20
    .long   L19-L20
    L19:
    [...]
    

    The problem lies in cmpl $7, %eax; ja L11. After andl $7, %eax the value can never exceed 7, so the compare and branch are redundant.

  3. OK, I'm going with the ugly solution: a special case for GCC versions below 4.4 that drops the switch and uses goto with GCC's &&label (labels as values) extension.

    static void *jtb[] = { &&c_1, &&c_2, &&c_3, &&c_4, &&c_5, &&c_6, &&c_7, &&c_8 };
    [...]
    goto *jtb[a & 0x7];
    [...]
    while(0) {
    c_1:
    // something
    break;
    c_2:
    // something
    break;
    [...]
    }
    

    Note that the array of labels is static, so it is not recomputed on every call.
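
The approach in update 3 could be fleshed out roughly as follows. This is only a sketch under stated assumptions: the counting workload and the names count_switch, count_goto and count_ops are hypothetical, and the version test uses the predefined __GNUC__ and __GNUC_MINOR__ macros to pick the computed-goto path on GCC older than 4.4.

```c
#include <stddef.h>

/* Portable path: a plain switch over the low three bits. */
void count_switch(const unsigned char *data, size_t n, unsigned counts[8])
{
    size_t i;
    for (i = 0; i < n; i++) {
        switch (data[i] & 0x7) {
        case 0: counts[0]++; break;
        case 1: counts[1]++; break;
        case 2: counts[2]++; break;
        case 3: counts[3]++; break;
        case 4: counts[4]++; break;
        case 5: counts[5]++; break;
        case 6: counts[6]++; break;
        case 7: counts[7]++; break;
        }
    }
}

#ifdef __GNUC__
/* GNU C path: labels as values (&&label) and a computed goto. */
void count_goto(const unsigned char *data, size_t n, unsigned counts[8])
{
    /* static: the table is built once, not on every call */
    static void *jtb[] = { &&c_0, &&c_1, &&c_2, &&c_3,
                           &&c_4, &&c_5, &&c_6, &&c_7 };
    size_t i;

    for (i = 0; i < n; i++) {
        goto *jtb[data[i] & 0x7];   /* index is always 0..7, no check */
        c_0: counts[0]++; continue;
        c_1: counts[1]++; continue;
        c_2: counts[2]++; continue;
        c_3: counts[3]++; continue;
        c_4: counts[4]++; continue;
        c_5: counts[5]++; continue;
        c_6: counts[6]++; continue;
        c_7: counts[7]++; continue;
    }
}
#endif

/* Use the computed goto only where the compiler reports a GNU C
   version below 4.4; everything else gets the plain switch. */
#if defined(__GNUC__) && \
    (__GNUC__ < 4 || (__GNUC__ == 4 && __GNUC_MINOR__ < 4))
#define count_ops count_goto
#else
#define count_ops count_switch
#endif
```

Callers would use count_ops(data, n, counts) and stay unaware of which path was compiled in. The table index must be masked exactly as shown, because a computed goto performs no bounds check at all.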

Solution

I tried compiling something simple and comparable with -O5 (GCC treats any level above -O3 as -O3) and -fno-inline (my f0-f7 functions were trivial), and it generated this:


 8048420:   55                      push   %ebp ;; function preamble
 8048421:   89 e5                   mov    %esp,%ebp ;; Yeah, yeah, it's a function.
 8048423:   83 ec 04                sub    $0x4,%esp ;; do stuff with the stack
 8048426:   8b 45 08                mov    0x8(%ebp),%eax ;; x86 sucks, we get it
 8048429:   83 e0 07                and    $0x7,%eax ;; Do the (a & 0x7)
 804842c:   ff 24 85 a0 85 04 08    jmp    *0x80485a0(,%eax,4) ;; Jump table!
 8048433:   90                      nop
 8048434:   8d 74 26 00             lea    0x0(%esi,%eiz,1),%esi
 8048438:   8d 45 08                lea    0x8(%ebp),%eax
 804843b:   89 04 24                mov    %eax,(%esp)
 804843e:   e8 bd ff ff ff          call   8048400 
 8048443:   8b 45 08                mov    0x8(%ebp),%eax
 8048446:   c9                      leave  

Did you try playing with optimization levels?
