Optimal and portable conversion of endianness in C/C++
Question
Given a binary file with 32-bit little-endian fields that I need to parse, I want to write parsing code that compiles correctly regardless of the endianness of the machine that executes it. Currently I use:
uint32_t fromLittleEndian(const char* data){
return uint32_t(data[3]) << (CHAR_BIT*3) |
uint32_t(data[2]) << (CHAR_BIT*2) |
uint32_t(data[1]) << CHAR_BIT |
data[0];
}
This, however, generates suboptimal assembly. On my machine, g++ -O3 -S produces:
_Z16fromLittleEndianPKc:
.LFB4:
.cfi_startproc
movsbl 3(%rdi), %eax
sall $24, %eax
movl %eax, %edx
movsbl 2(%rdi), %eax
sall $16, %eax
orl %edx, %eax
movsbl (%rdi), %edx
orl %edx, %eax
movsbl 1(%rdi), %edx
sall $8, %edx
orl %edx, %eax
ret
.cfi_endproc
Why is this happening? How can I convince the compiler to produce optimal code when compiling for little-endian machines, such as:
_Z17fromLittleEndian2PKc:
.LFB5:
.cfi_startproc
movl (%rdi), %eax
ret
.cfi_endproc
which I got by compiling:
uint32_t fromLittleEndian2(const char* data){
return *reinterpret_cast<const uint32_t*>(data);
}
Since I know my machine is little-endian, I know that the assembly above is optimal, but the code will give wrong results if compiled for a big-endian machine. It also violates the strict-aliasing rules, so it is undefined behavior that may actually misbehave once inlined, even on little-endian machines. Is there valid code that will be compiled to the optimal assembly where possible?
Since I expect my function to be inlined a lot, any kind of runtime endianness detection is out of the question. The only alternative I see for writing optimal C/C++ code is compile-time endianness detection, using templates or #defines to fall back to the inefficient code if the target is not little-endian. This, however, seems quite difficult to do portably.
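A side note on the listing above: the movsbl instructions are sign-extending loads, which suggests that plain char is signed on this target. With the original function, any byte with its high bit set would therefore be sign-extended before shifting and could corrupt the higher bytes of the result. A minimal sketch of the same byte-wise approach that reads the data as unsigned char instead is shown below; the name fromLittleEndianUnsigned is hypothetical and not from the original post.

```cpp
#include <climits>
#include <cstdint>

// Hypothetical variant of the question's function (name is an assumption).
// Reads four bytes as unsigned char so that no sign extension can occur,
// then assembles them in little-endian order regardless of host endianness.
inline std::uint32_t fromLittleEndianUnsigned(const char* data) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << CHAR_BIT)
         | (static_cast<std::uint32_t>(p[2]) << (CHAR_BIT * 2))
         | (static_cast<std::uint32_t>(p[3]) << (CHAR_BIT * 3));
}
```

Recent GCC and Clang releases can often recognize this pattern and emit a single 32-bit load on little-endian targets, but that is an optimizer behavior rather than a guarantee, so it is worth checking the generated assembly.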
Recommended answer
Various platform libraries that I know of do this by #defining macros for the endian-swapping routines, based on the value of a #define such as BIG_ENDIAN. In the cases where the source endianness matches your target endianness, the conversion can simply be a no-op:
#ifdef LITTLE_ENDIAN
#define fromLittleEndian(x) (x)
#else
#define fromLittleEndian(x) _actuallySwapLittle((x))
#endif
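As a rough illustration of the same idea applied to the question's function, here is a sketch that picks the implementation at compile time. It assumes a GCC- or Clang-compatible compiler, which predefines __BYTE_ORDER__ and __ORDER_LITTLE_ENDIAN__; on other compilers it simply falls back to the byte-wise path. The name loadLittleEndian32 is hypothetical. Note that some system headers (glibc's <endian.h>, for instance) define both LITTLE_ENDIAN and BIG_ENDIAN as constants, so a plain #ifdef LITTLE_ENDIAN test may not distinguish the two cases there; comparing against a byte-order macro is safer.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helper (name is an assumption, not from the original answer).
inline std::uint32_t loadLittleEndian32(const char* data) {
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
    // Little-endian host: the stored bytes already have the right order.
    // memcpy avoids the strict-aliasing and alignment problems of a
    // reinterpret_cast and is typically compiled to a single load.
    std::uint32_t value;
    std::memcpy(&value, data, sizeof value);
    return value;
#else
    // Big-endian or unknown host: assemble the value byte by byte.
    const unsigned char* p = reinterpret_cast<const unsigned char*>(data);
    return  static_cast<std::uint32_t>(p[0])
         | (static_cast<std::uint32_t>(p[1]) << 8)
         | (static_cast<std::uint32_t>(p[2]) << 16)
         | (static_cast<std::uint32_t>(p[3]) << 24);
#endif
}
```

Because the selection happens in the preprocessor, there is no runtime check left in the inlined code, which is what the question asks for.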
For example:
http://man7.org/linux/man-pages/man3/endian.3.html
http://fxr.watson.org/fxr/source/sys/endian.h
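For instance, on Linux with glibc the first link documents le32toh(), which converts a 32-bit little-endian value to host byte order. A minimal sketch of using it for the question's case follows; it assumes <endian.h> is available (the BSDs provide similar functions in <sys/endian.h>), and the name readLe32 is hypothetical.

```cpp
#include <endian.h>   // glibc-specific header; an assumption about the platform
#include <cstdint>
#include <cstring>

// Hypothetical wrapper (name is an assumption): copy the raw bytes into an
// integer, then let the platform macro convert from little-endian to host order.
inline std::uint32_t readLe32(const char* data) {
    std::uint32_t raw;
    std::memcpy(&raw, data, sizeof raw);  // avoids aliasing and alignment issues
    return le32toh(raw);                  // expected to be a no-op on little-endian hosts
}
```

This keeps the byte-order decision inside the platform header, at the cost of depending on a non-standard include.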