Is it UB to access a member by casting an object pointer to `char *`, then doing `*(member_type*)(pointer + offset)`?
Question
Here's an example:
#include <cstddef>
#include <iostream>

struct A
{
    char padding[7];
    int x;
};

constexpr int offset = offsetof(A, x);

int main()
{
    A a;
    a.x = 42;
    char *ptr = (char *)&a;
    std::cout << *(int *)(ptr + offset) << '\n'; // Well-defined or not?
}
I always assumed that it's well-defined (otherwise, what would be the point of `offsetof`?), but I wasn't sure.
Recently I was told that it's in fact UB, so I want to figure it out once and for all.
Does the example above cause UB or not? If you modify the class so that it is no longer standard-layout, does that affect the result?
And if it's UB, are there any workarounds for it (e.g. applying `std::launder`)?
This entire topic seems to be moot and underspecified.
Here's some information I was able to find:
- Is adding to a "char *" pointer UB, when it doesn't actually point to a char array? - In 2011, CWG confirmed that we're allowed to examine the representation of a standard-layout object through an `unsigned char` pointer.
  - It's unclear whether a `char` pointer can be used instead; common sense says it can.
  - It's unclear whether, starting from C++17, `std::launder` needs to be applied to the result of the `(unsigned char *)` cast. Given that it would be a breaking change, it's probably unnecessary, at least in practice.
  - It's unclear why C++17 changed `offsetof` to conditionally support non-standard-layout types (this used to be UB). It seems to imply that if an implementation supports that, then it also lets you examine the representation of non-standard-layout objects through `unsigned char *`.
- Do we need to use std::launder when doing pointer arithmetic within a standard-layout object (e.g., with offsetof)? - A question similar to this one. No definitive answer was given.
Recommended answer
Here I will refer to the C++20 (draft) wording, because one relevant editorial issue was fixed between C++17 and C++20, and because the HTML version of the C++20 draft makes it possible to refer to specific sentences. Otherwise there is nothing new in comparison to C++17.
First, the definition of pointer values, [basic.compound]/3:
Every value of pointer type is one of the following:
- a pointer to an object or function (the pointer is said to point to the object or function), or
- a pointer past the end of an object ([expr.add]), or
- the null pointer value for that type, or
- an invalid pointer value.
Now, let's see what happens in the `(char *)&a` expression.
I won't prove that `a` is an lvalue denoting an object of type `A`; I will simply say «the object `a`» to refer to this object.
The meaning of the `&a` subexpression is covered in [expr.unary.op]/(3.2):
if the operand is an lvalue of type `T`, the resulting expression is a prvalue of type "pointer to `T`" whose result is a pointer to the designated object
So, `&a` is a prvalue of type `A*` with the value «pointer to (the object) `a`».
Now, the cast in `(char *)&a` is equivalent to `reinterpret_cast<char*>(&a)`, which is defined as `static_cast<char*>(static_cast<void*>(&a))` ([expr.reinterpret.cast]/7).
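The equivalence of the three spellings can be sketched as follows. This is only an illustration: the helper names `casts_agree` and `demo_casts_agree` are made up for this example, and comparing the results shows that the pointers compare equal, not what object each one formally points to.

```cpp
#include <cassert>

struct A
{
    char padding[7];
    int x;
};

// Hypothetical helper: returns true if the three spellings of the cast
// produce pointers that compare equal.
bool casts_agree(A &a)
{
    char *c_style = (char *)&a;
    char *reinterpreted = reinterpret_cast<char *>(&a);
    char *two_step = static_cast<char *>(static_cast<void *>(&a));
    return c_style == reinterpreted && reinterpreted == two_step;
}

// Hypothetical driver, meant to be called from main().
void demo_casts_agree()
{
    A a{};
    assert(casts_agree(a));
}
```

Note that nothing here performs arithmetic on the resulting `char *`; only the casts themselves are exercised.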
The cast to `void*` doesn't change the pointer value ([conv.ptr]/2):
A prvalue of type "pointer to cv `T`", where `T` is an object type, can be converted to a prvalue of type "pointer to cv `void`". The pointer value ([basic.compound]) is unchanged by this conversion.
That is, it is still «pointer to (the object) `a`».
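A small sketch of what "unchanged pointer value" buys you: converting to `void*` and back to `A*` yields a pointer that still points to `a`, so the member can be read through it. The function name `read_x_via_void` is hypothetical.

```cpp
#include <cassert>

struct A
{
    char padding[7];
    int x;
};

// Hypothetical helper: round-trips the pointer through void* and reads x.
int read_x_via_void(A &a)
{
    void *erased = &a;                  // implicit conversion, [conv.ptr]/2
    A *back = static_cast<A *>(erased); // still points to the object a
    assert(back == &a);
    return back->x;
}
```

Called with an `A` whose `x` was set to 42, this returns 42.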
[expr.static.cast]/13 covers the outer `static_cast<char*>(...)`:
A prvalue of type "pointer to cv1 `void`" can be converted to a prvalue of type "pointer to cv2 `T`", where `T` is an object type and cv2 is the same cv-qualification as, or greater cv-qualification than, cv1. If the original pointer value represents the address A of a byte in memory and A does not satisfy the alignment requirement of `T`, then the resulting pointer value is unspecified. Otherwise, if the original pointer value points to an object a, and there is an object b of type `T` (ignoring cv-qualification) that is pointer-interconvertible with a, the result is a pointer to b. Otherwise, the pointer value is unchanged by the conversion.
There is no object of type `char` that is pointer-interconvertible with the object `a` ([basic.compound]/4):
Two objects a and b are pointer-interconvertible if:
- they are the same object, or
- one is a union object and the other is a non-static data member of that object ([class.union]), or
- one is a standard-layout class object and the other is the first non-static data member of that object, or, if the object has no non-static data members, any base class subobject of that object ([class.mem]), or
- there exists an object c such that a and c are pointer-interconvertible, and c and b are pointer-interconvertible.
This means that `static_cast<char*>(...)` doesn't change the pointer value either: it is the same as in its operand, namely «pointer to `a`».
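To make the pointer-interconvertibility rule concrete, here is a hedged sketch. The struct `S`, the alias `Padding`, and the function names are assumptions introduced for illustration. For `S`, the first member is an `int`, so a cast to `int *` reaches it. For `A`, the first member is the array `padding`, an object of type `char[7]` rather than `char`; the standard notes that an array object and its first element are not pointer-interconvertible, which is why no `char` object qualifies.

```cpp
#include <cassert>
#include <type_traits>

struct S // hypothetical example type
{
    int first;
    int second;
};

struct A
{
    char padding[7];
    int x;
};

static_assert(std::is_standard_layout_v<S>);
static_assert(std::is_standard_layout_v<A>);

// s and s.first are pointer-interconvertible, so the cast yields a pointer to s.first.
int read_first_member(S &s)
{
    return *reinterpret_cast<int *>(&s);
}

using Padding = char[7];

// a and a.padding (the whole array) are pointer-interconvertible,
// so the cast yields a pointer to the array object.
Padding *padding_of(A &a)
{
    return reinterpret_cast<Padding *>(&a);
}

// Hypothetical driver, meant to be called from main().
void demo_interconvertible()
{
    A a{};
    assert(padding_of(a) == &a.padding);
}
```

Indexing through `*padding_of(a)` stays inside a real `char` array, which is a different situation from arithmetic on a `char *` whose value is «pointer to `a`».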
So, `(char *)&a` is a prvalue of type `char*` whose value is «pointer to `a`». This value is stored in the `char* ptr` variable. Then, when you try to do pointer arithmetic with such a value, namely `ptr + offset`, you step into [expr.add]/6:
For addition or subtraction, if the expressions `P` or `Q` have type "pointer to cv `T`", where `T` and the array element type are not similar, the behavior is undefined.
For the purposes of pointer arithmetic, the object `a` is considered to be an element of an array `A[1]` ([basic.compound]/3), so the array element type is `A`. The type of the pointer expression `P` is «pointer to `char`», and `char` and `A` are not similar types (see [conv.qual]/2), so the behavior is undefined.
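The answer does not name a workaround, so the following is only a sketch of two ways to read the member without doing arithmetic on a pointer derived from `&a`. It assumes the type is trivially copyable; the function names are hypothetical. The first copies the object's bytes into a real `unsigned char` array, where the arithmetic has an actual array to work on; the second uses a pointer to member and needs no offset at all.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <type_traits>

struct A
{
    char padding[7];
    int x;
};

static_assert(std::is_trivially_copyable_v<A>);

// Hypothetical helper: copies the bytes of a into an array, then copies
// the bytes at offsetof(A, x) into an int.
int read_x_from_copy(const A &a)
{
    unsigned char bytes[sizeof(A)];
    std::memcpy(bytes, &a, sizeof(A));
    int value;
    // bytes is a genuine array of unsigned char, so bytes + offset is ordinary array arithmetic.
    std::memcpy(&value, bytes + offsetof(A, x), sizeof value);
    return value;
}

// Hypothetical helper: selects the member through a pointer to member.
int read_x_via_member_pointer(const A &a, int A::*member)
{
    return a.*member;
}

// Hypothetical driver, meant to be called from main().
void demo_workarounds()
{
    A a{};
    a.x = 42;
    assert(read_x_from_copy(a) == 42);
    assert(read_x_via_member_pointer(a, &A::x) == 42);
}
```

The copy costs `sizeof(A)` bytes per read, so the pointer-to-member form is usually preferable when the member is known at compile time.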