Endianness detection and performance in C


Question

I have performance-critical C code that needs to work on a variety of platforms. Some of them are little-endian, others are big-endian.

Detecting endianness is currently an imperfect process based on macro detection, and it is difficult to be sure that the macro detection will work for all combinations of systems and compilers. Welcome to the world of portable code.
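For context, here is a minimal sketch of what such macro detection can look like. It relies on `__BYTE_ORDER__`, `__ORDER_LITTLE_ENDIAN__` and `__ORDER_BIG_ENDIAN__`, which GCC and Clang predefine; other compilers may not define them, which is exactly the portability gap described above. The names `MY_ENDIAN_KNOWN`, `MY_LITTLE_ENDIAN` and `my_one` are hypothetical and chosen only for this illustration.

```c
/* Sketch only: compile-time detection through GCC/Clang predefined macros,
   with a runtime-test fallback when the macros are missing.
   MY_ENDIAN_KNOWN, MY_LITTLE_ENDIAN and my_one are hypothetical names. */
#if defined(__BYTE_ORDER__) && defined(__ORDER_LITTLE_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_LITTLE_ENDIAN__)
#  define MY_ENDIAN_KNOWN 1
#  define MY_LITTLE_ENDIAN 1
#elif defined(__BYTE_ORDER__) && defined(__ORDER_BIG_ENDIAN__) && \
    (__BYTE_ORDER__ == __ORDER_BIG_ENDIAN__)
#  define MY_ENDIAN_KNOWN 1
#  define MY_LITTLE_ENDIAN 0
#else
/* Unknown compiler: fall back to inspecting the first byte of an int. */
#  define MY_ENDIAN_KNOWN 0
static const int my_one = 1;
#  define MY_LITTLE_ENDIAN (*(const char *)&my_one)
#endif
```

When the first two branches apply, `MY_LITTLE_ENDIAN` is a true integer constant and can even be used in `#if`; in the fallback branch it is only a runtime expression.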

One relatively safe way to detect endianness is to use a runtime test and hope that the compiler optimizes it away. Something along these lines:

static const int one = 1;
#define IS_LITTLE_ENDIAN (*(char*)(&one))

In general it works. The compiler should detect that the result of this macro is always the same for a given architecture (1 for little-endian, 0 for big-endian) and simply remove the memory access and the associated branch altogether.
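To make the intended use concrete, here is a minimal sketch of a branch that depends on the macro. The helper `read_le32` is hypothetical and not part of the original code; the hope expressed above is that an optimizing compiler keeps only one of the two paths, but that is an expectation about the optimizer, not something the language guarantees.

```c
#include <stdint.h>
#include <string.h>

static const int one = 1;
#define IS_LITTLE_ENDIAN (*(const char *)(&one))

/* Hypothetical helper: read a 32-bit value stored in little-endian order. */
static uint32_t read_le32(const void *src)
{
    if (IS_LITTLE_ENDIAN) {
        /* Native byte order already matches: a plain copy is enough. */
        uint32_t v;
        memcpy(&v, src, sizeof v);
        return v;
    } else {
        /* Big-endian host: assemble the value byte by byte. */
        const unsigned char *p = (const unsigned char *)src;
        return (uint32_t)p[0]
             | ((uint32_t)p[1] << 8)
             | ((uint32_t)p[2] << 16)
             | ((uint32_t)p[3] << 24);
    }
}
```

Both paths return the same value on any host, so the test only affects speed, never correctness.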

My question is: is this always the case? Can we expect the compiler to always understand this test and always optimize it correctly? (Assuming -O2/-O3 or an equivalent optimization level; this of course does not apply to debug builds.)

I'm especially worried about bi-endian CPUs such as ARM. Since such a CPU can be either big-endian or little-endian depending on OS parameters, it might be difficult for the compiler to "hardwire" such an endianness test. On the other hand, I don't expect an application to work in "either endian mode of choice": I guess it has to be compiled for one precise and definitive endianness. Therefore, IS_LITTLE_ENDIAN should always give the same result.

Anyway, I'm asking for the experience of people who have met this situation. Since I don't currently have a bi-endian CPU and compiler to play with, I'm not in a position to test the above assumption.

[Edit] @Brandin proposes to "save the result" of the macro by making it a variable. I guess he proposes something like this:

static const int one = 1;
static const int isLittleEndian = *(char*)(&one);

Since a static const int is evaluated at compile time, this would indeed guarantee that the compiler knows the value of isLittleEndian and can therefore properly optimize branches that use this variable.

Unfortunately, it doesn't work. The above declaration results in the following compilation error:

error: initializer element is not constant

I guess that's because &one (a pointer address) cannot be evaluated at compile time.
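The underlying rule is that, in C, the initializer of an object with static storage duration must be a constant expression, and reading a value through a pointer does not qualify. The same expression is accepted for an automatic variable inside a function, as in the sketch below. The names `probe_value` and `is_little_endian_local` are hypothetical. Note that this only makes the code compile; whether the value is folded to a constant is still up to the optimizer.

```c
static const int probe_value = 1;   /* hypothetical name */

/* Hypothetical helper: same test as the macro, stored in a local variable. */
static int is_little_endian_local(void)
{
    /* Automatic (non-static) variable: its initializer does not need to be
       a constant expression, unlike a file-scope or static object. */
    const int isLittleEndian = *(const char *)&probe_value;
    return isLittleEndian;
}
```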

@HuStmpHrrr's variant, which uses a union instead, looks better: there is no pointer address to evaluate. Unfortunately, it doesn't work any better and results in the same compilation error.

I guess that's because the compiler does not consider a union member simple enough to be usable as the value of a static const initializer.

So we're back to the beginning, with a macro instead.

Recommended answer

The same idea, but a different trick. This code will work too.

union {
  int num;
  char bytes[sizeof(int)];
} endian;

endian.num = 1;

Then use endian.bytes[0] to decide.

This way, things come more naturally, and the compiler should be expected to do something about it, since this is easy for even a simple data-flow optimizer to track.

endian.bytes[0] should be reduced to a constant.

Anyway, this approach is compiler dependent.
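As a self-contained illustration of the union trick, here is a minimal sketch wrapped in a function so that the assignment is a valid statement. The function name `is_little_endian_union` is hypothetical. Whether the call is reduced to a constant depends on the compiler and the optimization level, as noted above.

```c
/* Hypothetical helper: union-based endianness test. */
static int is_little_endian_union(void)
{
    union {
        int  num;
        char bytes[sizeof(int)];
    } endian;

    endian.num = 1;
    /* The lowest-addressed byte holds 1 on a little-endian host
       and 0 on a big-endian host. */
    return endian.bytes[0];
}
```

A caller would branch on it like any other condition, for example `if (is_little_endian_union()) { ... }`.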
