Aliasing `T*` with `char*` is allowed. Is it also allowed the other way around?


Problem description

Note: This question has been renamed and reduced to make it more focused and readable. Most of the comments refer to the old text.

--

According to the standard, objects of different type may not share the same memory location. So this would not be legal:

std::array<short, 4> shorts;
int* i = reinterpret_cast<int*>(shorts.data()); // Not OK

The standard, however, allows an exception to this rule: any object may be accessed through a pointer to char or unsigned char:

int i = 0;
char * c = reinterpret_cast<char*>(&i); // OK

However, it is not clear to me whether this is also allowed the other way around. For example:

char * c = read_socket(...);
unsigned * u = reinterpret_cast<unsigned*>(c); // huh?

Solution

Some of your code is questionable due to the pointer conversions involved. Keep in mind that in those instances reinterpret_cast<T*>(e) has the semantics of static_cast<T*>(static_cast<void*>(e)) because the types that are involved are standard-layout. (I would in fact recommend that you always use static_cast via cv void* when dealing with storage.)
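As a small illustration of that equivalence, the sketch below compares the two spellings of the conversion. The helper names `via_void` and `conversions_agree` are hypothetical and not part of the question or the answer.

```cpp
// Hypothetical helper: converts an object pointer through void*,
// the form recommended above for dealing with storage.
template <typename T, typename U>
T* via_void(U* p) {
    return static_cast<T*>(static_cast<void*>(p));
}

// Returns true when reinterpret_cast and the static_cast chain
// produce the same pointer value for an int object.
bool conversions_agree() {
    int i = 0;
    unsigned char* a = reinterpret_cast<unsigned char*>(&i);
    unsigned char* b = via_void<unsigned char>(&i);
    return a == b;
}
```

Both conversions only produce a pointer; whether an access through that pointer is valid is a separate question, governed by the aliasing and lifetime rules.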

A close reading of the Standard suggests that during a pointer conversion to or from T* it is assumed that there really is an actual object of type T involved -- which is hard to fulfill in some of your snippets, even when 'cheating' thanks to the triviality of the types involved (more on this later). That would be beside the point, however, because...

Aliasing is not about pointer conversions. This is the C++11 text that outlines the rules that are commonly referred to as 'strict aliasing' rules, from 3.10 Lvalues and rvalues [basic.lval]:

10 If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:

  • the dynamic type of the object,
  • a cv-qualified version of the dynamic type of the object,
  • a type similar (as defined in 4.4) to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to the dynamic type of the object,
  • a type that is the signed or unsigned type corresponding to a cv-qualified version of the dynamic type of the object,
  • an aggregate or union type that includes one of the aforementioned types among its elements or non-static data members (including, recursively, an element or non-static data member of a subaggregate or contained union),
  • a type that is a (possibly cv-qualified) base class type of the dynamic type of the object,
  • a char or unsigned char type.

(This is paragraph 15 of the same clause and subclause in C++03, with some minor changes in the text with e.g. 'lvalue' being used instead of 'glvalue' since the latter is a C++11 notion.)
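To make two of those bullets concrete, here is a minimal sketch of accesses that the list permits for an object whose dynamic type is int. The function names are hypothetical, chosen only for illustration.

```cpp
// Permitted: unsigned is the unsigned type corresponding to int,
// the dynamic type of the object being accessed.
unsigned read_as_unsigned(int& i) {
    return *reinterpret_cast<unsigned*>(&i);
}

// Permitted: the glvalue used for the access has type unsigned char.
unsigned char first_byte(int& i) {
    return *reinterpret_cast<unsigned char*>(&i);
}

// Not on the list: reading the same int object through a glvalue of
// type short or float. That access would have undefined behavior.
```

For a non-negative int value, the read through unsigned yields the same value, since corresponding signed and unsigned types share the representation of those values.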

In the light of those rules, let's assume that an implementation provides us with magic_cast<T*>(p) which 'somehow' converts a pointer to another pointer type. Normally this would be reinterpret_cast, which yields unspecified results in some cases, but as I've explained before this is not so for pointers to standard-layout types. Then it's plainly true that all of your snippets are correct (substituting reinterpret_cast with magic_cast), because no glvalues are involved whatsoever with the results of magic_cast.

Here is a snippet that appears to incorrectly use magic_cast, but which I will argue is correct:

// assume constexpr max
constexpr auto alignment = max(alignof(int), alignof(short));
alignas(alignment) char c[sizeof(int)];
// I'm assuming here that the OP really meant to use &c and not c
// this is, however, inconsequential
auto p = magic_cast<int*>(&c);
*p = 42;
*magic_cast<short*>(p) = 42;

To justify my reasoning, assume this superficially different snippet:

// alignment same as before
alignas(alignment) char c[sizeof(int)];

auto p = magic_cast<int*>(&c);
// end lifetime of c
c.~decltype(c)();
// reuse storage to construct new int object
new (&c) int;

*p = 42;

auto q = magic_cast<short*>(p);
// end lifetime of int object
p->~decltype(0)();
// reuse storage again
new (p) short;

*q = 42;

This snippet is carefully constructed. In particular, in new (&c) int; I'm allowed to use &c even though c was destroyed, due to the rules laid out in paragraph 5 of 3.8 Object lifetime [basic.life]. Paragraph 6 of the same subclause gives very similar rules for references to storage, and paragraph 7 explains what happens to variables, pointers and references that used to refer to an object once its storage is reused -- I will refer collectively to those as 3.8/5-7.

In this instance &c is (implicitly) converted to void*, which is one of the correct uses of a pointer to storage that has not yet been reused. Similarly p is obtained from &c before the new int is constructed. Its definition could perhaps be moved to after the destruction of c, depending on how deep the implementation magic is, but certainly not after the int construction: paragraph 7 would apply and this is not one of the allowed situations. The construction of the short object also relies on p becoming a pointer to storage.

Now, because int and short are trivial types, I don't have to use the explicit calls to destructors. I don't need the explicit calls to the constructors, either (that is to say, the calls to the usual, Standard placement new declared in <new>). From 3.8 Object lifetime [basic.life]:

1 [...] The lifetime of an object of type T begins when:

  • storage with the proper alignment and size for type T is obtained, and
  • if the object has non-trivial initialization, its initialization is complete.

The lifetime of an object of type T ends when:

  • if T is a class type with a non-trivial destructor (12.4), the destructor call starts, or
  • the storage which the object occupies is reused or released.

This means that I can rewrite the code such that, after folding the intermediate variable q, I end up with the original snippet.

Do note that p cannot be folded away. That is to say, the following is definitely incorrect:

alignas(alignment) char c[sizeof(int)];
*magic_cast<int*>(&c) = 42;
*magic_cast<short*>(&c) = 42;

If we assume that an int object is (trivially) constructed with the second line, then that must mean &c becomes a pointer to storage that has been reused. Thus the third line is incorrect -- although due to 3.8/5-7 and not due to aliasing rules strictly speaking.

If we don't assume that, then the second line is a violation of aliasing rules: we're accessing what is actually a char c[sizeof(int)] object through a glvalue of type int, which is not one of the allowed exceptions. By comparison, *magic_cast<unsigned char*>(&c) = 42; would be fine (we would assume a short object is trivially constructed on the third line).

Just like Alf, I would also recommend that you explicitly make use of the Standard placement new when using storage. Skipping destruction for trivial types is fine, but when encountering *some_magic_pointer = foo; you're very likely facing either a violation of 3.8/5-7 (no matter how magically that pointer was obtained) or of the aliasing rules. This means storing the result of the new expression, too, since you most likely can't reuse the magic pointer once your object is constructed -- due to 3.8/5-7 again.
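A minimal sketch of that recommendation, assuming a local buffer and the hypothetical function name `reuse_storage`: each object is created with placement new, and only the pointer returned by the new expression is used to reach it.

```cpp
#include <new>

// Creates an int and then a short in the same storage, keeping the
// pointer returned by each placement new instead of casting the buffer.
int reuse_storage() {
    // Storage aligned for both types; the strictest alignment applies.
    alignas(int) alignas(short) unsigned char storage[sizeof(int)];

    int* pi = new (storage) int(42);    // lifetime of the int begins
    int seen = *pi;                     // access through the returned pointer

    // Reusing the storage ends the lifetime of the int; pi must not
    // be dereferenced after this point.
    short* ps = new (storage) short(7);
    return seen + *ps;
}
```

The destructor calls are skipped here because int and short are trivial types, which the answer notes is fine.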

Reading the bytes of an object (this means using char or unsigned char) is fine however, and you don't even need to use reinterpret_cast or anything magic at all. static_cast via cv void* is arguably fine for the job (although I do feel like the Standard could use some better wording there).
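As a sketch of reading bytes without reinterpret_cast, the hypothetical helper `sum_bytes` below views an object through const unsigned char* obtained with static_cast via const void*.

```cpp
#include <cstddef>

// Adds up the bytes of any object by inspecting its object
// representation through const unsigned char*.
template <typename T>
unsigned sum_bytes(const T& obj) {
    const unsigned char* bytes =
        static_cast<const unsigned char*>(static_cast<const void*>(&obj));
    unsigned total = 0;
    for (std::size_t n = 0; n != sizeof(T); ++n) {
        total += bytes[n];
    }
    return total;
}
```

Only the direction shown here (from T to bytes) is covered by the char exception; going from a char buffer to a T still needs an actual T object in that storage, as discussed above.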
