Is it UB to access a member by casting an object pointer to `char *`, then doing `*(member_type*)(pointer + offset)`?


Question

Here's an example:

#include <cstddef>
#include <iostream>

struct A
{
    char padding[7];
    int x;
};
constexpr int offset = offsetof(A, x);

int main()
{
    A a;
    a.x = 42;
    char *ptr = (char *)&a;
    std::cout << *(int *)(ptr + offset) << '\n'; // Well-defined or not?
}

I always assumed that it's well-defined (otherwise what would be the point of offsetof), but wasn't sure.

Recently I was told that it's in fact UB, so I want to figure it out once and for all.

Does the example above cause UB or not? If you modify the class to not be standard-layout, does it affect the result?

And if it's UB, are there any workarounds for it (e.g. applying std::launder)?

This entire topic seems to be moot and underspecified.

Here's some information I was able to find:

  • Is adding to a "char *" pointer UB, when it doesn't actually point to a char array? - In 2011, CWG confirmed that we're allowed to examine the representation of a standard-layout object through an unsigned char pointer.
      • Unclear if a char pointer can be used instead; common sense says it can.
      • Unclear if, starting from C++17, std::launder needs to be applied to the result of the (unsigned char *) cast. Given that it would be a breaking change, it's probably unnecessary, at least in practice.

  • Unclear why C++17 changed offsetof to conditionally support non-standard-layout types (it used to be UB). It seems to imply that if an implementation supports that, then it also lets you examine the representation of non-standard-layout objects through unsigned char *.

  • Do we need to use std::launder when doing pointer arithmetic within a standard-layout object (e.g., with offsetof)? - A question similar to this one. No definitive answer was given.

Answer

Here I will refer to the C++20 (draft) wording, because one relevant editorial issue was fixed between C++17 and C++20, and because the HTML version of the C++20 draft makes it possible to refer to specific sentences. Otherwise there is nothing new in comparison to C++17.

First, the definition of a pointer value ([basic.compound]/3):

Every value of pointer type is one of the following:
— a pointer to an object or function (the pointer is said to point to the object or function), or
— a pointer past the end of an object ([expr.add]), or
— the null pointer value for that type, or
— an invalid pointer value.

Now, let's see what happens in the (char *)&a expression.

I will take it as given, without proof, that a is an lvalue denoting an object of type A, and I will say «the object a» to refer to this object.

The meaning of the &a subexpression is covered in [expr.unary.op]/(3.2):

if the operand is an lvalue of type T, the resulting expression is a prvalue of type "pointer to T" whose result is a pointer to the designated object

So, &a is a prvalue of type A* with the value «pointer to (the object) a».

Now, the cast in (char *)&a is equivalent to reinterpret_cast<char*>(&a), which is defined as static_cast<char*>(static_cast<void*>(&a)) ([expr.reinterpret.cast]/7).

The cast to void* doesn't change the pointer value ([conv.ptr]/2):


A prvalue of type "pointer to cv T", where T is an object type, can be converted to a prvalue of type "pointer to cv void". The pointer value ([basic.compound]) is unchanged by this conversion.

i.e. it is still «pointer to (the object) a».

The outer static_cast<char*>(...) is covered by [expr.static.cast]/13:

A prvalue of type "pointer to cv1 void" can be converted to a prvalue of type "pointer to cv2 T", where T is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of T, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type T (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.

There is no object of type char which is pointer-interconvertible with the object a ([basic.compound]/4):


Two objects a and b are pointer-interconvertible if:
— they are the same object, or
— one is a union object and the other is a non-static data member of that object ([class.union]), or
— one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, any base class subobject of that object ([class.mem]), or
— there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.

which means that the static_cast<char*>(...) doesn't change the pointer value and it is the same as in its operand, namely: «pointer to a».

So, (char *)&a is a prvalue of type char* whose value is «pointer to a». This value is stored into char* ptr variable. Then, when you try to do pointer arithmetic with such a value, namely ptr + offset, you step into [expr.add]/6:


For addition or subtraction, if the expressions P or Q have type "pointer to cv T", where T and the array element type are not similar, the behavior is undefined.

For the purposes of pointer arithmetic, the object a is considered to be an element of an array A[1] ([basic.compound]/3), so the array element type is A, the type of the pointer expression P is «pointer to char», char and A are not similar types (see [conv.qual]/2), so the behavior is undefined.
