Most efficient standard-compliant way of reinterpreting int as float


Problem description


Assume I have a guarantee that float is IEEE 754 binary32. Given a bit pattern that corresponds to a valid float, stored in a std::uint32_t, how does one reinterpret it as a float in the most efficient standard-compliant way?

float reinterpret_as_float(std::uint32_t ui) {
   return /* apply sorcery to ui */;
}

I've got a few ways that I know/suspect/assume have some issues:

  1. Via reinterpret_cast,

    float reinterpret_as_float(std::uint32_t ui) {
        return reinterpret_cast<float&>(ui);
    }
    

    or equivalently

    float reinterpret_as_float(std::uint32_t ui) {
        return *reinterpret_cast<float*>(&ui);
    }
    

    which suffers from aliasing issues.

  2. Via union,

    float reinterpret_as_float(std::uint32_t ui) {
        union {
            std::uint32_t ui;
            float f;
        } u = {ui};
        return u.f;
    }
    

    which is not actually legal, as one is only allowed to read from the most recently written member. Yet it seems some compilers (gcc) allow this.

  3. Via std::memcpy,

    float reinterpret_as_float(std::uint32_t ui) {
        float f;
        std::memcpy(&f, &ui, 4);
        return f;
    }
    

    which AFAIK is legal, but a function call to copy a single word seems wasteful, though it might get optimized away.

  4. Via reinterpret_casting to char* and copying,

    float reinterpret_as_float(std::uint32_t ui) {
        char* uip = reinterpret_cast<char*>(&ui);
        float f;
        char* fp = reinterpret_cast<char*>(&f);
        for (int i = 0; i < 4; ++i) {
            fp[i] = uip[i];
        }
        return f;
    }
    

    which AFAIK is also legal, as char pointers are exempt from aliasing issues, and the manual byte-copying loop saves a possible function call. The loop will most definitely be unrolled, yet 4 possibly separate one-byte loads/stores are worrisome; I have no idea whether this is optimizable to a single four-byte load/store.

Option 4 is the best I've been able to come up with.

Am I correct so far? Is there a better way to do this, particularly one that will guarantee a single load/store?

Solution

AFAIK, there are only two approaches that comply with the strict aliasing rules: memcpy() and casting to char* with copying. All the others read a float from memory that belongs to a uint32_t, and the compiler is allowed to perform the read before the write to that memory location. It might even optimize away the write altogether, as it can prove that under the strict aliasing rules the stored value will never be used, resulting in a garbage return value.
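As a minimal sketch of the memcpy() approach, the assumptions from the question (float is IEEE 754 and 32 bits wide) can be checked at compile time instead of taken on trust. The function name bits_to_float is hypothetical and chosen only for this illustration.

```cpp
#include <cstdint>
#include <cstring>
#include <limits>

// Hypothetical helper name; copies the object representation of ui into a float.
float bits_to_float(std::uint32_t ui) {
    // Fail the build if the platform does not match the question's assumptions.
    static_assert(std::numeric_limits<float>::is_iec559,
                  "float is assumed to be an IEEE 754 type");
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float is assumed to be 32 bits wide");

    float f;
    std::memcpy(&f, &ui, sizeof f);  // no float is read through a uint32_t lvalue
    return f;
}
```

Using sizeof f rather than a literal 4 keeps the copy size tied to the destination type.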

It really depends on the compiler/optimizer whether memcpy() or the char* copy is faster. In both cases, an intelligent compiler might be able to figure out that it can just load and copy a uint32_t, but I would not trust any compiler to do so before I have seen it in the resulting assembly code.
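To compare the two compliant variants on a given compiler, something like the following sketch can be compiled with -S at the optimization level of interest and the output for each function inspected. The names via_memcpy, via_byte_copy and variants_agree are hypothetical; variants_agree is only a sanity check that both produce the same bytes, not a performance measurement.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "float is assumed to be 32 bits wide");

// Variant 1: let memcpy copy the object representation.
float via_memcpy(std::uint32_t ui) {
    float f;
    std::memcpy(&f, &ui, sizeof f);
    return f;
}

// Variant 2: copy byte by byte through character pointers,
// which are allowed to alias any object.
float via_byte_copy(std::uint32_t ui) {
    float f;
    const unsigned char* src = reinterpret_cast<const unsigned char*>(&ui);
    unsigned char* dst = reinterpret_cast<unsigned char*>(&f);
    for (std::size_t i = 0; i < sizeof f; ++i) {
        dst[i] = src[i];
    }
    return f;
}

// Bitwise comparison, so the check does not depend on float equality rules.
bool variants_agree(std::uint32_t ui) {
    const float a = via_memcpy(ui);
    const float b = via_byte_copy(ui);
    return std::memcmp(&a, &b, sizeof a) == 0;
}
```

Whether the two functions compile to the same instructions is exactly the compiler-dependent part, so the generated assembly remains the only reliable evidence.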

Edit:
After some testing with gcc 4.8.1, I can say that the memcpy() approach is the best for this particular compiler; see below for details.


Compiling

#include <stdint.h>

float foo(uint32_t a) {
    float b;
    char* aPointer = (char*)&a, *bPointer = (char*)&b;
    for( int i = sizeof(a); i--; ) bPointer[i] = aPointer[i];
    return b;
}

with gcc -S -std=gnu11 -O3 foo.c yields this assembly code:

movl    %edi, %ecx
movl    %edi, %edx
movl    %edi, %eax
shrl    $24, %ecx
shrl    $16, %edx
shrw    $8, %ax
movb    %cl, -1(%rsp)
movb    %dl, -2(%rsp)
movb    %al, -3(%rsp)
movb    %dil, -4(%rsp)
movss   -4(%rsp), %xmm0
ret

This is not optimal.

Doing the same with

#include <stdint.h>
#include <string.h>

float foo(uint32_t a) {
    float b;
    char* aPointer = (char*)&a, *bPointer = (char*)&b;
    memcpy(bPointer, aPointer, sizeof(a));
    return b;
}

yields (with all optimization levels except -O0):

movl    %edi, -4(%rsp)
movss   -4(%rsp), %xmm0
ret

This is optimal.
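As a side note that goes beyond the gcc 4.8.1 test above: C++20 adds std::bit_cast in the <bit> header, which expresses the same reinterpretation without a visible copy. The sketch below assumes a toolchain that may or may not provide it, so it falls back to memcpy() when the feature-test macro is absent. The name bits_to_float_cxx20 is hypothetical, and no claim is made here about the code any particular compiler generates for it.

```cpp
#include <cstdint>
#include <cstring>
#if defined(__has_include)
#  if __has_include(<version>)
#    include <version>  // provides library feature-test macros where available
#  endif
#endif
#if defined(__cpp_lib_bit_cast)
#  include <bit>
#endif

// Hypothetical helper name; uses std::bit_cast when the library offers it.
float bits_to_float_cxx20(std::uint32_t ui) {
    static_assert(sizeof(float) == sizeof(std::uint32_t),
                  "float is assumed to be 32 bits wide");
#if defined(__cpp_lib_bit_cast)
    return std::bit_cast<float>(ui);   // C++20 path
#else
    float f;                           // pre-C++20 fallback: the memcpy approach
    std::memcpy(&f, &ui, sizeof f);
    return f;
#endif
}
```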
