Why bit endianness is an issue in bitfields?


Question


Any portable code that uses bitfields seems to distinguish between little- and big-endian platforms. See the declaration of struct iphdr in the Linux kernel for an example of such code. I fail to understand why bit endianness is an issue at all.

As far as I understand, bitfields are purely compiler constructs, used to facilitate bit level manipulations.

For instance, consider the following bitfield:

struct ParsedInt {
    unsigned int f1:1;
    unsigned int f2:3;
    unsigned int f3:4;
};
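uint8_t i;
/* Cast required to compile; illustration only, since the struct is usually larger than one byte. */
struct ParsedInt *d = (struct ParsedInt *)&i;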

Here, writing d->f2 is simply a compact and readable way of saying (i >> 1) & ((1 << 3) - 1), since f2 is 3 bits wide and follows the 1-bit f1.

However, bit operations are well-defined and work regardless of the architecture. So, how come bitfields are not portable?

Solution

Under the C standard, the compiler is free to store a bit-field in almost any way it likes. You can never make assumptions about where the bits are allocated. Here are just a few bit-field related things that the C standard does not specify:

Unspecified behavior

  • The alignment of the addressable storage unit allocated to hold a bit-field (6.7.2.1).

Implementation-defined behavior

  • Whether a bit-field can straddle a storage-unit boundary (6.7.2.1).
  • The order of allocation of bit-fields within a unit (6.7.2.1).
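To make the "order of allocation" point concrete, here is a minimal sketch that packs the three fields of struct ParsedInt by hand in the two orders a compiler may pick within one 8-bit unit. The helper names pack_lsb_first and pack_msb_first are hypothetical and exist only for this illustration; a real compiler does this internally and may also add padding.

```c
#include <stdint.h>

/* Hypothetical helpers: pack f1 (1 bit), f2 (3 bits) and f3 (4 bits) into
 * one byte by hand, mimicking two possible allocation orders. */

/* First-declared field goes into the least significant bits. */
uint8_t pack_lsb_first(unsigned f1, unsigned f2, unsigned f3)
{
    return (uint8_t)((f1 & 0x1u) | ((f2 & 0x7u) << 1) | ((f3 & 0xFu) << 4));
}

/* First-declared field goes into the most significant bits. */
uint8_t pack_msb_first(unsigned f1, unsigned f2, unsigned f3)
{
    return (uint8_t)(((f1 & 0x1u) << 7) | ((f2 & 0x7u) << 4) | (f3 & 0xFu));
}
```

With f1 = 1, f2 = 5 and f3 = 9, the first order gives the byte 0x9B and the second gives 0xD9: same field values, different bits in memory.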

Byte order (big or little endian) is of course also implementation-defined. This means that your struct could be allocated in any of the following ways (assuming 16-bit ints):

PADDING : 8
f1 : 1
f2 : 3
f3 : 4

or

PADDING : 8
f3 : 4
f2 : 3
f1 : 1

or

f1 : 1
f2 : 3
f3 : 4
PADDING : 8

or

f3 : 4
f2 : 3
f1 : 1
PADDING : 8

Which one applies? Take a guess, or read the in-depth back-end documentation of your compiler. Add to this the complexity of 32-bit integers, in big- or little-endian byte order. Then add the fact that the compiler is allowed to insert any number of padding bytes anywhere inside your bit-field struct, because it is treated like any other struct (it cannot add padding at the very beginning of the struct, but it can everywhere else).
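Rather than guessing, one way to see what a particular compiler did is to inspect the object representation. The following is only a sketch: probe_f1 and print_f1_layout are hypothetical names, and the result tells you about one compiler, target and set of flags, nothing more.

```c
#include <stddef.h>
#include <stdio.h>
#include <string.h>

struct ParsedInt {
    unsigned int f1 : 1;
    unsigned int f2 : 3;
    unsigned int f3 : 4;
};

/* Copy the bytes of a ParsedInt with only f1 set into out (at most cap
 * bytes). Returns the number of bytes copied. */
size_t probe_f1(unsigned char *out, size_t cap)
{
    struct ParsedInt p;
    size_t n = sizeof p < cap ? sizeof p : cap;

    memset(&p, 0, sizeof p);   /* clear all bits, padding included */
    p.f1 = 1;                  /* set only the first-declared field */
    memcpy(out, &p, n);        /* reading bytes via memcpy is well-defined */
    return n;
}

/* Print the struct size and the byte pattern produced by f1 = 1. */
void print_f1_layout(void)
{
    unsigned char bytes[sizeof(struct ParsedInt)];
    size_t n = probe_f1(bytes, sizeof bytes);

    printf("sizeof(struct ParsedInt) = %zu, bytes with f1 = 1:", n);
    for (size_t i = 0; i < n; i++)
        printf(" %02x", bytes[i]);
    printf("\n");
}
```

The size and the position of the set bit can differ between compilers and targets, which is exactly why code like struct iphdr needs per-platform declarations.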

And I haven't even mentioned what happens if you use plain "int" as the bit-field type (whether it is signed or unsigned is implementation-defined), or if you use any type other than signed int, unsigned int or, since C99, _Bool (whether such a type is accepted at all is implementation-defined).
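The plain "int" case can be checked with a small sketch like the one below. The struct OneBit and the function names are hypothetical, and the comments about -1 assume a two's-complement target.

```c
struct OneBit {
    int          plain : 1;  /* signed or unsigned: implementation-defined */
    signed int   s     : 1;  /* holds 0 and -1 on two's-complement targets */
    unsigned int u     : 1;  /* holds 0 and 1 */
};

/* Returns 1 if this compiler treats a plain int bit-field as signed. */
int plain_int_bitfield_is_signed(void)
{
    struct OneBit b = {0, 0, 0};

    b.plain = -1;        /* stays -1 if signed, becomes 1 if unsigned */
    return b.plain < 0;
}

/* Value read back from a 1-bit signed field after storing -1. */
int signed_one_bit_value(void)
{
    struct OneBit b = {0, 0, 0};

    b.s = -1;
    return b.s;
}

/* Value read back from a 1-bit unsigned field after storing 1. */
unsigned unsigned_one_bit_value(void)
{
    struct OneBit b = {0, 0, 0};

    b.u = 1;
    return b.u;
}
```

A 1-bit field is the sharpest case: code that tests `b.plain == 1` works only where the field happens to be unsigned.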

So to answer the question, there is no such thing as portable bit-field code, because the C standard is extremely vague about how bit-fields should be implemented. The only thing bit-fields can be trusted with is to be chunks of boolean values, where the programmer isn't concerned with the location of the bits in memory.

The only portable solution is to use the bit-wise operators instead of bit-fields. The generated machine code will typically be the same, but the layout is deterministic. Bit-wise operators are 100% portable on any C compiler for any system.
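As a sketch of the bit-wise approach, consider the first byte of an IPv4 header, which carries the version in its high nibble and the header length (IHL) in its low nibble. The function names below are hypothetical; the point is that shifts and masks on an unsigned byte give the same answer on every platform, with no endianness #ifdef.

```c
#include <stdint.h>

/* Version field: upper 4 bits of the first header byte. */
unsigned ipv4_version(uint8_t first_byte)
{
    return (first_byte >> 4) & 0x0Fu;
}

/* Header length in 32-bit words: lower 4 bits of the first header byte. */
unsigned ipv4_ihl(uint8_t first_byte)
{
    return first_byte & 0x0Fu;
}

/* Build the first header byte from its two fields. */
uint8_t ipv4_pack_first_byte(unsigned version, unsigned ihl)
{
    return (uint8_t)(((version & 0x0Fu) << 4) | (ihl & 0x0Fu));
}
```

For the common first byte 0x45, ipv4_version returns 4 and ipv4_ihl returns 5, and ipv4_pack_first_byte(4, 5) gives back 0x45.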
