How to make GCC generate bswap instruction for big endian store without builtins?

Question

I'm working on a function that stores a 64-bit value into memory in big endian format. I was hoping that I could write portable C99 code that works on both little and big endian platforms and have modern x86 compilers generate a bswap instruction automatically without any builtins or intrinsics. So I started with the following function:

#include <stdint.h>

void
encode_bigend_u64(uint64_t value, void *vdest) {
    uint64_t bigend;
    uint8_t *bytes = (uint8_t*)&bigend;
    bytes[0] = value >> 56;
    bytes[1] = value >> 48;
    bytes[2] = value >> 40;
    bytes[3] = value >> 32;
    bytes[4] = value >> 24;
    bytes[5] = value >> 16;
    bytes[6] = value >> 8;
    bytes[7] = value;
    uint64_t *dest = (uint64_t*)vdest;
    *dest = bigend;
}

This works fine for clang which compiles this function to:

bswapq  %rdi
movq    %rdi, (%rsi)
retq

But GCC fails to detect the byte swap. I tried a couple of different approaches but they only made things worse. I know that GCC can detect byte swaps using bitwise-and, shift, and bitwise-or, but why doesn't it work when writing bytes?

Recommended answer

This seems to do the trick:

#include <stdint.h>

void encode_bigend_u64(uint64_t value, void* dest)
{
  *(uint64_t*)dest =
      ((value & 0xFF00000000000000u) >> 56u) |
      ((value & 0x00FF000000000000u) >> 40u) |
      ((value & 0x0000FF0000000000u) >> 24u) |
      ((value & 0x000000FF00000000u) >>  8u) |
      ((value & 0x00000000FF000000u) <<  8u) |
      ((value & 0x0000000000FF0000u) << 24u) |
      ((value & 0x000000000000FF00u) << 40u) |
      ((value & 0x00000000000000FFu) << 56u);
}

clang with -O3:

encode_bigend_u64(unsigned long, void*):
        bswapq  %rdi
        movq    %rdi, (%rsi)
        retq

clang with -O3 -march=native:

encode_bigend_u64(unsigned long, void*):
        movbeq  %rdi, (%rsi)
        retq

gcc with -O3:

encode_bigend_u64(unsigned long, void*):
        bswap   %rdi
        movq    %rdi, (%rsi)
        ret

gcc with -O3 -march=native:

encode_bigend_u64(unsigned long, void*):
        movbe   %rdi, (%rsi)
        ret


Tested with clang 3.8.0 and gcc 5.3.0 on http://gcc.godbolt.org/, so I don't know exactly which processor -march=native resolves to there, but I strongly suspect a recent x86_64 processor.
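One thing the function above takes for granted is that dest is suitably aligned for a uint64_t and may legally be accessed through a uint64_t pointer. As a minimal sketch, assuming you want to drop that requirement, the same shift-and-mask expression can be stored with memcpy instead. The name encode_bigend_u64_memcpy is made up for illustration, and the assembly this variant produces was not checked with the compilers listed above.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical variant of the answer's function (name is illustrative).
   The swap expression is unchanged; only the final store differs:
   memcpy places no alignment or aliasing requirement on dest. */
void encode_bigend_u64_memcpy(uint64_t value, void *dest)
{
    uint64_t swapped =
        ((value & 0xFF00000000000000u) >> 56u) |
        ((value & 0x00FF000000000000u) >> 40u) |
        ((value & 0x0000FF0000000000u) >> 24u) |
        ((value & 0x000000FF00000000u) >>  8u) |
        ((value & 0x00000000FF000000u) <<  8u) |
        ((value & 0x0000000000FF0000u) << 24u) |
        ((value & 0x000000000000FF00u) << 40u) |
        ((value & 0x00000000000000FFu) << 56u);

    /* Copy the 8 swapped bytes to dest, whatever its alignment. */
    memcpy(dest, &swapped, sizeof swapped);
}
```

Like the original, this unconditionally swaps, so it only yields big endian output on a little endian host.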

If you want a function that also works on big endian architectures, you can use the answers from here (http://stackoverflow.com/questions/1001307/detecting-endianness-programmatically-in-a-c-program) to detect the endianness of the system and add an if. Both the union and the pointer cast versions work, and both gcc and clang optimize them to the exact same assembly (no branches). Full code on godbolt:

int is_big_endian(void)
{
    union {
        uint32_t i;
        char c[4];
    } bint = {0x01020304};

    return bint.c[0] == 1;
}

void encode_bigend_u64_union(uint64_t value, void* dest)
{
  if (!is_big_endian())
    //...
  else
    *(uint64_t*)dest = value;
}

void encode_bigend_u64_ptr_cast(uint64_t value, void* dest)
{
  const uint16_t endian_test = 1;
  if (*(uint8_t*)(&endian_test) == 1)
    //..
  else
    *(uint64_t*)dest = value;
}
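The elided branches above presumably hold the shift-and-mask expression from the first listing. A sketch of how the union version might look when written out in full, under that assumption; swap_u64 and encode_bigend_u64_checked are illustrative names that do not appear in the answer:

```c
#include <stdint.h>

/* Hypothetical helper: the shift-and-mask byte swap from the first listing. */
static uint64_t swap_u64(uint64_t value)
{
    return ((value & 0xFF00000000000000u) >> 56u) |
           ((value & 0x00FF000000000000u) >> 40u) |
           ((value & 0x0000FF0000000000u) >> 24u) |
           ((value & 0x000000FF00000000u) >>  8u) |
           ((value & 0x00000000FF000000u) <<  8u) |
           ((value & 0x0000000000FF0000u) << 24u) |
           ((value & 0x000000000000FF00u) << 40u) |
           ((value & 0x00000000000000FFu) << 56u);
}

/* Union-based probe, as in the answer: the first byte in memory of
   0x01020304 is 0x01 only on a big endian host. */
static int is_big_endian(void)
{
    union {
        uint32_t i;
        char c[4];
    } bint = {0x01020304};

    return bint.c[0] == 1;
}

/* Assumes, like the answer, that dest is aligned for uint64_t. */
void encode_bigend_u64_checked(uint64_t value, void *dest)
{
    if (!is_big_endian())
        *(uint64_t *)dest = swap_u64(value);  /* little endian host: swap */
    else
        *(uint64_t *)dest = value;            /* big endian host: store as-is */
}
```

Because is_big_endian() has a constant result on any given target, an optimizing compiler can drop the untaken branch, which matches the branch-free assembly reported above.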


Intel® 64 and IA-32 Architectures Instruction Set Reference (3-542 Vol. 2A), http://www.intel.com/content/dam/www/public/us/en/documents/manuals/64-ia-32-architectures-software-developer-instruction-set-reference-manual-325383.pdf:

MOVBE: Move Data After Swapping Bytes

Performs a byte swap operation on the data copied from the second operand (source operand) and store the result in the first operand (destination operand). [...]

The MOVBE instruction is provided for swapping the bytes on a read from memory or on a write to memory; thus providing support for converting little-endian values to big-endian format and vice versa.
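Since MOVBE covers reads as well as writes, the load direction can be sketched the same way. The function below is a hypothetical counterpart (decode_bigend_u64 is not part of the question or the answer) that assembles the value from individual bytes, so it gives the same result on little and big endian hosts. Whether a particular compiler reduces it to a single bswap or movbe was not verified here.

```c
#include <stdint.h>

/* Hypothetical counterpart to the encoder: read a 64-bit big endian
   value from memory. Each byte is widened to uint64_t before shifting
   so that no bits are lost. */
uint64_t decode_bigend_u64(const void *src)
{
    const unsigned char *b = (const unsigned char *)src;

    return ((uint64_t)b[0] << 56) |
           ((uint64_t)b[1] << 48) |
           ((uint64_t)b[2] << 40) |
           ((uint64_t)b[3] << 32) |
           ((uint64_t)b[4] << 24) |
           ((uint64_t)b[5] << 16) |
           ((uint64_t)b[6] <<  8) |
            (uint64_t)b[7];
}
```

Reading through unsigned char also means src does not have to be aligned.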
