Is it guaranteed that the padding bits of a "zeroed" structure will be zeroed in C?

Question

This statement in the article confused me:

C permits an implementation to insert padding into structures (but not into arrays) to ensure that all fields have a useful alignment for the target. If you zero a structure and then set some of the fields, will the padding bits all be zero? According to the results of the survey, 36 percent were sure that they would be, and 29 percent didn't know. Depending on the compiler (and optimization level), it may or may not be.

That was not entirely clear to me, so I turned to the standard. ISO/IEC 9899, §6.2.6.1, states:

When a value is stored in an object of structure or union type, including in a member object, the bytes of the object representation that correspond to any padding bytes take unspecified values.

Also in §6.7.2.1:

The order of allocation of bit-fields within a unit (high-order to low-order or low-order to high-order) is implementation-defined. The alignment of the addressable storage unit is unspecified.
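Where padding actually sits is up to the implementation, but it can be made visible. The following is a minimal sketch, assuming a hypothetical struct demo (it is not the struct from this question) and using only offsetof and sizeof; the numbers it yields differ between compilers and targets.

```c
#include <stddef.h>

/* Hypothetical struct used only for illustration. */
struct demo {
    char tag;
    int  value;
};

/* Bytes between the end of `tag` (at offset 0) and the start of `value`. */
size_t demo_inner_padding(void)
{
    return offsetof(struct demo, value) - sizeof(char);
}

/* Bytes after the end of `value` up to the end of the struct. */
size_t demo_tail_padding(void)
{
    return sizeof(struct demo) - (offsetof(struct demo, value) + sizeof(int));
}
```

Printing both results with %zu on the target in question shows how many bytes of the object belong to no member. The standard fixes neither value.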

I then remembered that I had recently implemented a hack in which I used the undeclared part of a byte owned by bit-fields. It looked something like this:

/* This struct is always allocated on the heap and is zeroed. */
struct some_struct {
  /* initial part ... */
  enum {
    ONE,
    TWO,
    THREE,
    FOUR,
  } some_enum:8;
  unsigned char flag:1;
  unsigned char another_flag:1;
  unsigned int size_of_smth;
  /* ... remaining part */
};

The structure was not under my control, so I couldn't change it, but I urgently needed to pass some information through it. So I computed the address of the relevant byte like this:

/* Cast first so the subtraction steps back one byte, not one unsigned int. */
unsigned char *ptr = (unsigned char *)&some->size_of_smth - 1;
*ptr |= 0xC0; /* set flags */

Later I checked the flags the same way.

I should also mention that the target compiler and platform were fixed, so this is not a cross-platform matter. Still, these questions remain:

  1. Can I rely on the padding bits of a struct (on the heap) still being zero after memset/kzalloc/whatever and some subsequent use? (This post does not cover the topic in terms of the standard or of safeguards for further use of the struct.) And what about a struct zeroed on the stack with = {0}?

  2. If yes, does that mean I can safely use the "unnamed"/"undeclared" part of a bit-field to carry information for my own purposes everywhere in C (different platforms, compilers, ...)? (Assuming I know for sure that no one crazy is trying to store anything in this byte.)

Answer

The short answer to your first question is "no".

While an appropriate call of memset(), such as memset(&some_struct_instance, 0, sizeof(some_struct_instance)), will set all bytes in the structure to zero, that change is not required to persist after "some use" of some_struct_instance, such as setting any of its members.

So, for example, there is no guarantee that some_struct_instance.some_enum = THREE (i.e. storing a value into a member) will leave any padding bits in some_struct_instance unchanged. The only requirement in the standard is that the values of the other members of the structure are unaffected. The compiler may implement the assignment (in emitted object code or machine instructions) using some set of bitwise operations, and it is allowed to take shortcuts that do not leave the padding bits alone (e.g. by not emitting instructions that would otherwise ensure the padding bits are unaffected).
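To observe what a particular compiler does, the padding bytes can be inspected through an unsigned char pointer. This is a minimal sketch, assuming a hypothetical struct record with one char and one int member; the helper names are made up for illustration. It only reports what happened on one build. It cannot turn that observation into a guarantee.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical struct used only for illustration. */
struct record {
    char tag;
    int  value;
};

/* Count the non-zero bytes of *r that belong to no member (i.e. padding). */
size_t record_nonzero_padding(const struct record *r)
{
    const unsigned char *bytes = (const unsigned char *)r;
    size_t count = 0;

    for (size_t i = 0; i < sizeof *r; i++) {
        int in_tag   = i < sizeof r->tag;                 /* tag is at offset 0 */
        int in_value = i >= offsetof(struct record, value) &&
                       i <  offsetof(struct record, value) + sizeof r->value;
        if (!in_tag && !in_value && bytes[i] != 0)
            count++;
    }
    return count;
}

/* Zero every byte, then store into the members. */
void record_init(struct record *r, char tag, int value)
{
    memset(r, 0, sizeof *r);  /* every byte, padding included, is 0 here */
    r->tag = tag;             /* from this store on, padding is unspecified */
    r->value = value;
}
```

Right after the memset() the count is zero. After record_init() returns, a count of zero is only what this compiler happened to produce with these settings.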

Even worse, a simple assignment like some_struct_instance = some_other_struct_instance (which, by definition, is the storing of a value into some_struct_instance) comes with no guarantees about the values of padding bits. It is not guaranteed that the padding bits in some_struct_instance will be set to the same bitwise values as the padding bits in some_other_struct_instance, nor is there a guarantee that the padding bits in some_struct_instance will be unchanged. This is because the compiler is allowed to implement the assignment by whatever means it deems most "efficient" (e.g. copying memory verbatim, some set of member-wise assignments, or whatever) but - since the values of padding bits after the assignment are unspecified - is not required to ensure the padding bits are unchanged.
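One practical consequence is that two structures holding the same member values may still differ byte for byte. The sketch below, which assumes a hypothetical struct pair and made-up helper names, contrasts a member-wise comparison with a memcmp() over the whole object; only the former ignores padding.

```c
#include <string.h>

/* Hypothetical struct used only for illustration. */
struct pair {
    char tag;
    int  value;
};

/* Compares members only, so padding bytes never influence the result. */
int pair_equal(const struct pair *a, const struct pair *b)
{
    return a->tag == b->tag && a->value == b->value;
}

/* Compares every byte, padding included; may report a difference even
   when all members are equal. */
int pair_bytes_equal(const struct pair *a, const struct pair *b)
{
    return memcmp(a, b, sizeof *a) == 0;
}
```

After b = a, pair_equal(&a, &b) is true, while the result of pair_bytes_equal(&a, &b) depends on how the compiler chose to copy the padding.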

If you get lucky, and fiddling with the padding bits works for your purpose, it will not be because of any support in the C standard. It will be because of the good graces of the compiler vendor (e.g. choosing to emit a set of machine instructions that ensure padding bits are not changed). And, practically, there is no guarantee that the compiler vendor will keep doing things the same way - for example, code that relies on such behaviour may break when the compiler is updated or when you choose different optimisation settings.

Since the answer to your first question is "no", there is no need to answer your second question. However, philosophically, if you are trying to store data in padding bits of a structure, it is reasonable to assert that someone else - crazy or not - may potentially attempt to do the same thing, but using an approach that messes up the data you are attempting to pass around.
