Do I understand C/C++ strict-aliasing correctly?

Question

I've read this article about C/C++ strict aliasing. I think the same applies to C++.

As I understand it, the strict aliasing rule lets the compiler rearrange code for performance optimization. That's why two pointers of different (and, in the C++ case, unrelated) types cannot refer to the same memory location.

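Does this mean that problems can occur only if the memory is modified? Apart from possible problems with memory alignment (http://stackoverflow.com/questions/7320766/where-can-i-find-documentation-on-c-memory-alignment-across-different-platforms).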

For example, when handling a network protocol or de-serializing data, I have a dynamically allocated byte array, and the packet struct inside it is properly aligned. Can I reinterpret_cast it to my packet struct?

char const* buf = ...; // dynamically allocated
// The target type keeps const: reinterpret_cast cannot cast away constness.
unsigned int i = *reinterpret_cast<unsigned int const*>(buf + shift); // shift satisfies the alignment requirements

Solution

The problem here is not strict aliasing so much as structure representation requirements.

First, it is safe to alias between char, signed char, or unsigned char and any one other type (in your case, unsigned int). This allows you to write your own memory-copy loops, as long as they are defined using a char type. This is authorized by the following language in C99 (§6.5):

 6. The effective type of an object for an access to its stored value is the declared type of the object, if any. [Footnote: Allocated objects have no declared type] [...] If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

 7. An object shall have its stored value accessed only by an lvalue expression that has one of the following types: [Footnote: The intent of this list is to specify those circumstances in which an object may or may not be aliased.]

  • a type compatible with the effective type of the object,
  • [...]
  • a character type.
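
As a rough illustration of the memory-copy loop mentioned above, here is a minimal sketch. The names copy_bytes and load_uint are made up for this example and are not from the standard or the question. It copies the bytes into a real unsigned int object instead of dereferencing a casted pointer, so it also sidesteps any alignment concern.

```cpp
#include <cassert>
#include <cstddef>

// Copies n bytes from src to dst one unsigned char at a time.
// Reading or writing any object through an unsigned char lvalue is
// permitted by the aliasing rules quoted above.
void copy_bytes(void* dst, void const* src, std::size_t n) {
    assert(dst != nullptr && src != nullptr);
    unsigned char* d = static_cast<unsigned char*>(dst);
    unsigned char const* s = static_cast<unsigned char const*>(src);
    for (std::size_t i = 0; i < n; ++i) {
        d[i] = s[i];
    }
}

// Reads an unsigned int stored at a byte offset in buf.
// The offset does not need to satisfy the alignment of unsigned int,
// because the bytes are copied into a properly aligned local object.
unsigned int load_uint(char const* buf, std::size_t offset) {
    unsigned int value = 0;
    copy_bytes(&value, buf + offset, sizeof value);
    return value;
}
```

In real code std::memcpy does the same job as copy_bytes; the hand-written loop is only here to show that a char-typed loop is allowed.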

Similar language can be found in the C++0x draft N3242 §3.10/10, although it is not as clear when the 'dynamic type' of an object is assigned (I'd appreciate any further references on what the dynamic type is of a char array, to which a POD object has been copied as a char array with proper alignment).

As such, aliasing is not a problem here. However, a strict reading of the standard indicates that a C++ implementation has a great deal of freedom in choosing a representation of an unsigned int.

As one random example, an unsigned int might be a 24-bit integer stored in four bytes, with 8 padding bits interspersed; if any of these padding bits does not match a certain (constant) pattern, the value is treated as a trap representation, and dereferencing the pointer may result in a crash. Is this a likely implementation? Perhaps not. But there have been, historically, systems with parity bits and other oddness, and so directly reading from the network into an unsigned int, by a strict reading of the standard, is not kosher.
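
If you want the build to tell you whether the padding-bit scenario above applies to your platform, one possible check is sketched below. The names object_bits, value_bits and unsigned_int_has_padding_bits are invented for this example; the check compares the number of value bits reported by std::numeric_limits with the size of the object representation.

```cpp
#include <climits>
#include <limits>

// Bits in the object representation of unsigned int.
constexpr int object_bits = static_cast<int>(sizeof(unsigned int)) * CHAR_BIT;

// Bits that actually contribute to the value.
constexpr int value_bits = std::numeric_limits<unsigned int>::digits;

// True on an implementation like the hypothetical 24-in-32-bit one above.
constexpr bool unsigned_int_has_padding_bits() {
    return value_bits < object_bits;
}

// Refuse to compile on a platform where raw bytes could form a trap
// representation of unsigned int.
static_assert(!unsigned_int_has_padding_bits(),
              "unsigned int has padding bits; raw byte reads are unsafe here");
```

This only rules out padding bits in unsigned int; it says nothing about byte order or structure layout.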

Padding bits are a largely theoretical concern on today's systems, but the issue is worth noting. If you plan to stick to PC hardware, you don't really need to worry about it (but don't forget your ntohl calls: endianness is still a problem!).
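
To make the endianness point concrete, here is a small sketch that decodes a 32-bit value in network byte order. It is an alternative to calling ntohl, not what the question used: read_be32 is a hypothetical helper that assembles the value from individual bytes, so it needs no platform-specific header and gives the same result on little- and big-endian hosts.

```cpp
#include <cassert>
#include <cstdint>

// Decodes a 32-bit big-endian (network byte order) value starting at p.
// The bytes are combined with shifts, so the host byte order never matters
// and no pointer to a wider type is ever formed.
std::uint32_t read_be32(unsigned char const* p) {
    assert(p != nullptr);
    return (static_cast<std::uint32_t>(p[0]) << 24) |
           (static_cast<std::uint32_t>(p[1]) << 16) |
           (static_cast<std::uint32_t>(p[2]) << 8) |
           static_cast<std::uint32_t>(p[3]);
}
```

For example, the bytes 0x12 0x34 0x56 0x78 decode to 0x12345678 on any host.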

Structures make it even worse, of course: alignment and padding depend on your platform. I have worked on an embedded platform in which all types have an alignment of 1, so no padding is ever inserted into structures. This can result in inconsistencies when using the same structure definitions on multiple platforms. You can either manually work out the byte offsets for data structure members and reference them directly, or use a compiler-specific alignment directive to control padding.
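
The "work out the byte offsets manually" option could look something like the sketch below. The wire layout, PacketHeader, kWireHeaderSize and parse_header are all assumptions made up for illustration; the question does not show its packet format. Each field is read from its fixed wire offset, so the result does not depend on how the compiler pads the struct.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical wire layout (an assumption for this example):
//   offset 0: 1 byte   message type
//   offset 1: 2 bytes  payload length, big-endian
//   offset 3: 4 bytes  sequence number, big-endian
constexpr std::size_t kWireHeaderSize = 7;

// In-memory form. The compiler may pad this struct, so its size can
// differ from kWireHeaderSize; that is fine because it is never overlaid
// on the buffer.
struct PacketHeader {
    std::uint8_t type;
    std::uint16_t length;
    std::uint32_t sequence;
};

// Builds a PacketHeader field by field from explicit byte offsets.
PacketHeader parse_header(unsigned char const* buf, std::size_t size) {
    assert(buf != nullptr && size >= kWireHeaderSize);
    PacketHeader h;
    h.type = buf[0];
    h.length = static_cast<std::uint16_t>((buf[1] << 8) | buf[2]);
    h.sequence = (static_cast<std::uint32_t>(buf[3]) << 24) |
                 (static_cast<std::uint32_t>(buf[4]) << 16) |
                 (static_cast<std::uint32_t>(buf[5]) << 8) |
                 static_cast<std::uint32_t>(buf[6]);
    return h;
}
```

The compiler-specific alternative (a packing directive) would let you overlay the struct directly, at the cost of portability and possibly of unaligned access.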

So you must be careful when directly casting from a network buffer to native types or structures. But the aliasing itself is not a problem in this case.
