Optimal and portable conversion of endianness in C/C++

Question

I need to parse a binary file containing 32-bit little-endian fields, and I want the parsing code to work correctly regardless of the endianness of the machine that executes it. Currently I use:

uint32_t fromLittleEndian(const char* data){
  return uint32_t(data[3]) << (CHAR_BIT*3) |
         uint32_t(data[2]) << (CHAR_BIT*2) |
         uint32_t(data[1]) << CHAR_BIT |
         data[0]; 
}

This, however, generates suboptimal assembly. On my machine, g++ -O3 -S produces:

_Z16fromLittleEndianPKc:
.LFB4:
    .cfi_startproc
    movsbl  3(%rdi), %eax
    sall    $24, %eax
    movl    %eax, %edx
    movsbl  2(%rdi), %eax
    sall    $16, %eax
    orl %edx, %eax
    movsbl  (%rdi), %edx
    orl %edx, %eax
    movsbl  1(%rdi), %edx
    sall    $8, %edx
    orl %edx, %eax
    ret
    .cfi_endproc

Why is this happening? How can I convince the compiler to produce the optimal code when compiling for a little-endian machine:

_Z17fromLittleEndian2PKc:
.LFB5:
    .cfi_startproc
    movl    (%rdi), %eax
    ret
    .cfi_endproc

which is what I get by compiling:

uint32_t fromLittleEndian2(const char* data){
    return *reinterpret_cast<const uint32_t*>(data);
}

Since I know my machine is little-endian, I know the assembly above is optimal, but the code will give wrong results if compiled on a big-endian machine. It also violates the strict-aliasing rules, so it is undefined behavior and might misbehave when inlined, even on little-endian machines. Is there valid code that compiles to the optimal assembly where possible?

Since I expect my function to be inlined a lot, any kind of runtime endianness detection is out of the question. The only alternative to writing optimal C/C++ code is to use compile-time endianness detection, and to use templates or #defines to fall back to the inefficient code if the target is not little-endian. This, however, seems quite difficult to do portably.

Recommended answer

The platform libraries I know of do this by #defining macros for the endian-swapping routines, chosen according to a byte-order macro such as #define BIG_ENDIAN. When the source endianness matches your target endianness, the conversion can simply be a no-op:

#ifdef LITTLE_ENDIAN
    #define fromLittleEndian(x) (x)
#else
    #define fromLittleEndian(x) _actuallySwapLittle((x))
#endif
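
The same compile-time selection can be sketched without hand-maintained macros. This is only an illustration under stated assumptions: it relies on the `__BYTE_ORDER__` and `__ORDER_LITTLE_ENDIAN__` macros that GCC and Clang predefine, and the function name `loadLittleEndian32` is hypothetical. On little-endian targets it copies the bytes with `std::memcpy`, which avoids the strict-aliasing problem of the `reinterpret_cast` version; everywhere else it falls back to assembling the value byte by byte.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper, not part of any library.
// Reads a 32-bit little-endian value from a byte buffer.
inline std::uint32_t loadLittleEndian32(const char* data) {
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    // Assumption: GCC/Clang-style predefined macros are available.
    // memcpy is the aliasing-safe way to reinterpret the bytes.
    std::uint32_t value;
    std::memcpy(&value, data, sizeof value);
    return value;
#else
    // Portable fallback: works for any host byte order.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
#endif
}
```

Compilers typically reduce a fixed-size `memcpy` like this to a single load when optimizing, but that is worth confirming in the generated assembly for your toolchain.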

For example:

http://man7.org/linux/man-pages/man3/endian.3.html

http://fxr.watson.org/fxr/source/sys/endian.h
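
A side note on the shift-based code in the question: it reads through `const char*`, and the `movsbl` instructions in the listing are sign-extending loads, which suggests plain `char` is signed on that machine. In that case any byte of 0x80 or above is sign-extended before the shift, so the result can be wrong as well as slow. The sketch below makes the signedness explicit to show the difference; the names `decodeSigned` and `decodeUnsigned` are hypothetical.

```cpp
#include <cstdint>

// Mirrors the question's expression with the signedness made explicit,
// so the behavior does not depend on whether plain char is signed.
inline std::uint32_t decodeSigned(const signed char* d) {
    return std::uint32_t(d[3]) << 24 | std::uint32_t(d[2]) << 16 |
           std::uint32_t(d[1]) << 8  | d[0];
}

// Same expression reading unsigned bytes: no sign extension can occur.
inline std::uint32_t decodeUnsigned(const unsigned char* d) {
    return std::uint32_t(d[3]) << 24 | std::uint32_t(d[2]) << 16 |
           std::uint32_t(d[1]) << 8  | std::uint32_t(d[0]);
}
```

For the byte sequence 01 80 00 00, `decodeUnsigned` yields 0x00008001, while `decodeSigned` yields 0xFFFF8001 because the 0x80 byte is sign-extended before it is shifted. Some compilers are also able to merge the unsigned form into a single 32-bit load on little-endian targets, though this depends on the compiler and version.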
