Where does the C++ standard describe the casting of pointers to primitives?


Question

In the excellent blog post What Every Programmer Should Know About Undefined Behavior, the section "Violating Type Rules" says:

It is undefined behavior to cast an int* to a float* and dereference it (accessing the "int" as if it were a "float"). C requires that these sorts of type conversions happen through memcpy: using pointer casts is not correct and undefined behavior results. The rules for this are quite nuanced and I don't want to go into the details here (there is an exception for char*, vectors have special properties, unions change things, etc).

I'd like to understand the rules in their full nuancedness. Where are they in the C++11 spec? Or failing that, the C spec (C90, C99, C11)?

In the C++11 spec linked from this Stack Overflow question, N3485, I'm looking in 5.2.10 "Reinterpret cast" but don't see language for an exception for char* or unions. So that's probably not the right place. So where is the right place?

Answer

The rule you're looking for is in §3.10/10 (in C++11):

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

- the dynamic type of the object,
- a cv-qualified version of the dynamic type of the object,
- a type similar (as defined in 4.4) to the dynamic type of the object,
- a type that is the signed or unsigned type corresponding to the dynamic type of the object,
- a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
- an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
- a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
- a char or unsigned char type.
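To make the list concrete, here is a minimal sketch of two accesses it permits and one it does not. The function names `object_bytes` and `read_as_unsigned` are hypothetical and chosen only for illustration.

```cpp
#include <vector>

// Permitted: unsigned char is on the list, so the bytes of an int object
// may be read through an unsigned char glvalue.
std::vector<unsigned char> object_bytes(const int& value) {
    const unsigned char* p = reinterpret_cast<const unsigned char*>(&value);
    return std::vector<unsigned char>(p, p + sizeof value);
}

// Permitted: unsigned int is the unsigned type corresponding to int.
unsigned int read_as_unsigned(int& value) {
    return *reinterpret_cast<unsigned int*>(&value);
}

// Not permitted: float is none of the listed types for an int object,
// so the access below would be undefined behavior. It is left commented out.
// float read_as_float(int& value) {
//     return *reinterpret_cast<float*>(&value);
// }
```

Note that the cast itself is not what the rule restricts; it is the access to the stored value through the resulting glvalue.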

There are different types (or motivations) for undefined behavior.

In the case of casting an int* to float* and then dereferencing it, it is clear that the standard cannot define it, since what might happen will depend on the architecture, and the value of the int. On the other hand, the quoted paragraph is completely wrong—using memcpy to do the conversion is also undefined behavior, for largely the same reasons.

One of the motivations for undefined behavior is to allow implementations to define it in a manner that makes sense for the target architecture, if such a manner exists. This is such a case. A compiler which intentionally causes it to fail is defective. Of course, if we suppose a 32-bit two's complement int and a 32-bit IEEE float, we may expect certain values of the int to correspond to a trapping NaN, which will cause the program to fail. This is part of the reason the behavior is undefined: to allow such things to happen. But if we are familiar with the low-level details of the hardware, it should work as expected, provided the compiler can see the cast. If it doesn't, this is a quality of implementation (QoI) problem with the compiler, and such a compiler should be avoided for this type of work.

As hinted at above, this particular case, and in fact all cases which involve type punning (writing to one member of a union and reading from another, for example), pose a problem for which the standard has yet to find adequate wording. The problem occurs because normally, the compiler is allowed to assume that pointers to different types (except character types) do not alias; that is, that an int* can never point to the same object as a float*. And proving that two pointers cannot alias is important for optimization. A compiler that breaks code where the pointer cast or the union is clearly visible is just broken, even if the standard says it is undefined behavior. A compiler that breaks code where all it sees are two pointers to unrelated types is understandable, even in cases where the standard says the behavior is well defined.
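The situation in which the compiler sees only two pointers to unrelated types can be sketched as follows. The function name `store_then_load` is hypothetical, and whether a given compiler actually performs the optimization described in the comments depends on the compiler and its settings.

```cpp
// All the compiler sees here are an int* and a float*. Under the aliasing
// rule it may assume they refer to different objects.
int store_then_load(int* i, float* f) {
    *i = 1;
    *f = 2.0f;  // assumed not to modify *i
    return *i;  // so this load may be replaced by the constant 1
}
```

When the two pointers really do refer to different objects, as the rule requires, the function returns 1 either way. Only a caller that passes the same storage through both pointers would observe a difference, and that caller is the one with undefined behavior.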

Using memcpy avoids this problem by using two different objects, which don't alias. It still encounters undefined behavior because putting the bit pattern of an int into a float, then accessing the float, doesn't have any defined behavior. (Or vice-versa; I know of at least one machine where copying the bits of a float into an int may result in an illegal int value.)
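A minimal sketch of the memcpy approach follows. It assumes that float is 32 bits wide (checked at compile time) and uses std::uint32_t rather than int for the integer side. The names `bits_to_float` and `float_to_bits` are hypothetical.

```cpp
#include <cstdint>
#include <cstring>

static_assert(sizeof(float) == sizeof(std::uint32_t),
              "this sketch assumes a 32-bit float");

// Copies the bit pattern into a separate float object. No int object is
// ever accessed through a float glvalue, so the two objects do not alias.
float bits_to_float(std::uint32_t bits) {
    float result;
    std::memcpy(&result, &bits, sizeof result);
    return result;
}

// The reverse direction: copies the bytes of a float into an integer.
std::uint32_t float_to_bits(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

The caveat from the paragraph above still applies: the copy only moves bytes, and whether the resulting float (or integer) holds a usable value depends on the platform's representation of that type.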
