Aliasing T* with char* is allowed. Is it also allowed the other way around?

Question

Note: This question has been renamed and reduced to make it more focused and readable. Most of the comments refer to the old text.

According to the standard, objects of different types may not share the same memory location. So this would not be legal:

std::array<short, 4> shorts;
int* i = reinterpret_cast<int*>(shorts.data()); // Not OK

The standard, however, allows an exception to this rule: any object may be accessed through a pointer to char or unsigned char:

int i = 0;
char * c = reinterpret_cast<char*>(&i); // OK

However, it is not clear to me whether this is also allowed the other way around. For example:

char * c = read_socket(...);
unsigned * u = reinterpret_cast<unsigned*>(c); // huh?

Answer

Some of your code is questionable due to the pointer conversions involved. Keep in mind that in those instances reinterpret_cast<T*>(e) has the semantics of static_cast<T*>(static_cast<void*>(e)) because the types that are involved are standard-layout. (I would in fact recommend that you always use static_cast via cv void* when dealing with storage.)
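As a minimal sketch of that recommendation, the block below converts a pointer through void* with static_cast and compares the result with reinterpret_cast. The names storage_cast and same_address are hypothetical helpers introduced only for this illustration; they are not part of the Standard or of the original answer.

```cpp
#include <cassert>

// Hypothetical helper: converts a pointer to storage into T* by going
// through void*, using static_cast only.
template <typename T>
T* storage_cast(void* p) {
    assert(p != nullptr);
    return static_cast<T*>(p);
}

// Returns true when reinterpret_cast and the static_cast-via-void* route
// produce the same unsigned char* for an int object.
bool same_address() {
    int i = 0;
    unsigned char* a = reinterpret_cast<unsigned char*>(&i);
    unsigned char* b = storage_cast<unsigned char>(&i);  // int* -> void* is implicit
    return a == b;
}
```

Both routes only produce a pointer value; whether that pointer may then be dereferenced is a separate question, governed by the aliasing and lifetime rules discussed next.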

A close reading of the Standard suggests that during a pointer conversion to or from T* it is assumed that there really is an actual object T* involved -- which is hard to fulfill in some of your snippets, even when 'cheating' thanks to the triviality of the types involved (more on this later). That would be beside the point, however, because...

Aliasing is not about pointer conversions. This is the C++11 text that outlines the rules that are commonly referred to as 'strict aliasing' rules, from 3.10 Lvalues and rvalues [basic.lval]:

10 If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

  • the dynamic type of the object,
  • a cv-qualified version of the dynamic type of the object,
  • a type similar (as defined in 4.4) to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
  • a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
  • a char or unsigned char type.

(This is paragraph 15 of the same clause and subclause in C++03, with some minor changes in the text with e.g. 'lvalue' being used instead of 'glvalue' since the latter is a C++11 notion.)
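To make two of those bullets concrete, here is a small sketch of accesses that the list permits: reading an int through unsigned char (the last bullet) and through the corresponding unsigned type (the fourth bullet). The function names are hypothetical and exist only for this example.

```cpp
#include <cassert>
#include <cstddef>

// Counts the non-zero bytes in the object representation of an int.
// The access goes through unsigned char, which the list explicitly allows.
std::size_t count_nonzero_bytes(const int& value) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    std::size_t n = 0;
    for (std::size_t i = 0; i < sizeof(int); ++i) {
        if (bytes[i] != 0) ++n;
    }
    return n;
}

// Reads an int through a glvalue of type unsigned: allowed because unsigned
// is the unsigned type corresponding to the dynamic type int.
unsigned read_as_unsigned(int& value) {
    return *reinterpret_cast<unsigned*>(&value);
}

// Small self-check tying the two helpers together.
void aliasing_examples() {
    int x = 7;
    assert(read_as_unsigned(x) == 7u);
    assert(count_nonzero_bytes(x) == 1);
}
```

Reading the same int through, say, a short glvalue matches none of the bullets, which is why no such helper appears here.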

In the light of those rules, let's assume that an implementation provides us with magic_cast<T*>(p) which 'somehow' converts a pointer to another pointer type. Normally this would be reinterpret_cast, which yields unspecified results in some cases, but as I've explained before this is not so for pointers to standard-layout types. Then it's plainly true that all of your snippets are correct (substituting reinterpret_cast with magic_cast), because no glvalues are involved whatsoever with the results of magic_cast.

Here is a snippet that appears to incorrectly use magic_cast, but which I will argue is correct:

// assume constexpr max
constexpr auto alignment = max(alignof(int), alignof(short));
alignas(alignment) char c[sizeof(int)];
// I'm assuming here that the OP really meant to use &c and not c
// this is, however, inconsequential
auto p = magic_cast<int*>(&c);
*p = 42;
*magic_cast<short*>(p) = 42;

To justify my reasoning, assume this superficially different snippet:

// alignment same as before
alignas(alignment) char c[sizeof(int)];

auto p = magic_cast<int*>(&c);
// end lifetime of c
c.~decltype(c)();
// reuse storage to construct new int object
new (&c) int;

*p = 42;

auto q = magic_cast<short*>(p);
// end lifetime of int object
p->~decltype(0)();
// reuse storage again
new (p) short;

*q = 42;

This snippet is carefully constructed. In particular, in new (&c) int; I'm allowed to use &c even though c was destroyed, due to the rules laid out in paragraph 5 of 3.8 Object lifetime [basic.life]. Paragraph 6 of the same subclause gives very similar rules for references to storage, and paragraph 7 explains what happens to variables, pointers and references that used to refer to an object once its storage is reused -- I will refer collectively to those as 3.8/5-7.

In this instance &c is (implicitly) converted to void*, which is one of the correct uses of a pointer to storage that has not yet been reused. Similarly, p is obtained from &c before the new int is constructed. Its definition could perhaps be moved to after the destruction of c, depending on how deep the implementation magic is, but certainly not after the int construction: paragraph 7 would apply and this is not one of the allowed situations. The construction of the short object also relies on p becoming a pointer to storage.

Now, because int and short are trivial types, I don't have to use the explicit calls to destructors. I don't need the explicit calls to the constructors, either (that is to say, the calls to the usual, Standard placement new declared in <new>). From 3.8 Object lifetime [basic.life]:

1 [...] The lifetime of an object of type T begins when:

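  • storage with the proper alignment and size for type T is obtained, and
  • if the object has non-trivial initialization, its initialization is complete.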

The lifetime of an object of type T ends when:

  • if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
  • the storage which the object occupies is reused or released.

This means that I can rewrite the code such that, after folding the intermediate variable q, I end up with the original snippet.

Do note that p cannot be folded away. That is to say, the following is definitely incorrect:

alignas(alignment) char c[sizeof(int)];
*magic_cast<int*>(&c) = 42;
*magic_cast<short*>(&c) = 42;

If we assume that an int object is (trivially) constructed with the second line, then that must mean &c becomes a pointer to storage that has been reused. Thus the third line is incorrect -- although due to 3.8/5-7 and not due to aliasing rules strictly speaking.

If we don't assume that, then the second line is a violation of the aliasing rules: we're accessing what is actually a char c[sizeof(int)] object through a glvalue of type int, which is not one of the allowed exceptions. By comparison, *magic_cast<unsigned char*>(&c) = 42; would be fine (we would assume a short object is trivially constructed on the third line).

Just like Alf, I would also recommend that you explicitly make use of the Standard placement new when using storage. Skipping destruction for trivial types is fine, but when encountering *some_magic_pointer = foo; you are very likely facing a violation of either 3.8/5-7 (no matter how magically that pointer was obtained) or the aliasing rules. This means storing the result of the new expression, too, since you most likely can't reuse the magic pointer once your object is constructed -- due to 3.8/5-7 again.
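A minimal sketch of that advice, assuming a raw unsigned char buffer as the storage: each object is created with placement new and is only ever accessed through the pointer that the new expression returned. The function name reuse_storage_demo and the values 42 and 7 are made up for this illustration.

```cpp
#include <cassert>
#include <new>

// Creates an int and then a short in the same storage, keeping the pointer
// returned by each new expression instead of reusing a cast pointer.
int reuse_storage_demo() {
    static_assert(sizeof(short) <= sizeof(int), "storage must fit both types");
    alignas(int) alignas(short) unsigned char storage[sizeof(int)];

    int* pi = new (storage) int(42);     // lifetime of an int object begins
    assert(*pi == 42);
    int first = *pi;

    // Reusing the storage ends the lifetime of the int; pi must not be
    // dereferenced after this point.
    short* ps = new (storage) short(7);
    return first + *ps;
}
```

Because int and short are trivial types, no destructor call is needed between the two new expressions; reusing the storage is what ends the first object's lifetime.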

Reading the bytes of an object (this means using char or unsigned char) is fine, however, and you don't even need to use reinterpret_cast or anything magic at all. static_cast via cv void* is arguably fine for the job (although I do feel that the Standard could use some better wording there).
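The following sketch shows byte access done with static_cast via const void* only. For the opposite direction (bytes back into an unsigned) it uses std::memcpy, which the answer does not mention; it is swapped in here as a widely used way to avoid dereferencing a cast pointer. The helpers bytes_of and round_trip are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Returns a pointer to the object representation of value, obtained with
// static_cast through const void* rather than reinterpret_cast.
const unsigned char* bytes_of(const unsigned& value) {
    return static_cast<const unsigned char*>(static_cast<const void*>(&value));
}

// Copies the bytes of value into a buffer one by one, then rebuilds an
// unsigned from that buffer with std::memcpy (an assumption of this sketch).
unsigned round_trip(unsigned value) {
    unsigned char buffer[sizeof(unsigned)];
    const unsigned char* src = bytes_of(value);
    assert(src != nullptr);
    for (std::size_t i = 0; i < sizeof(unsigned); ++i) {
        buffer[i] = src[i];
    }
    unsigned result = 0;
    std::memcpy(&result, buffer, sizeof result);
    return result;
}
```

The buffer here plays the role of the data returned by read_socket in the question: its bytes are copied into a real unsigned object rather than being viewed through an unsigned* obtained by a cast.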
