Reinterpreting a union to a different union
Question
I have a standard-layout union that has a whole bunch of types in it:
union Big {
    Hdr h;
    A a;
    B b;
    C c;
    D d;
    E e;
    F f;
};
Each of the types A through F is standard-layout and has as its first member an object of type Hdr. The Hdr identifies what the active member of the union is, so this is variant-like. Now, I'm in a situation where I know for certain (because I checked) that the active member is either a B or a C. Effectively, I've reduced the space to:
union Little {
    Hdr h;
    B b;
    C c;
};
Now, is the following well-defined or undefined behavior?
void given_b_or_c(Little const& little); // declared before its first use

void given_big(Big const& big) {
    switch(big.h.type) {
    case B::type: // fallthrough
    case C::type:
        given_b_or_c(reinterpret_cast<Little const&>(big));
        break;
    // ... other cases here ...
    }
}

void given_b_or_c(Little const& little) {
    if (little.h.type == B::type) {
        use_a_b(little.b);
    } else {
        use_a_c(little.c);
    }
}
The goal of Little is to effectively serve as documentation that I've already checked that it's a B or a C, so that in the future nobody adds code to check whether it's an A or something.
Is the fact that I am reading the B subobject as a B enough to make this well-defined? Can the common initial sequence rule meaningfully be used here?
Recommended answer
To be able to take a pointer to A and reinterpret it as a pointer to B, the two objects must be pointer-interconvertible.
Pointer-interconvertibility is a property of objects, not of the types of objects.
In C++, there are objects at places. If you have a Big at a particular spot with at least one member existing, there is also a Hdr at that same spot due to pointer-interconvertibility.
However, there is no Little object at that spot. If there is no Little object there, nothing can be pointer-interconvertible with a Little object that isn't there.
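One way to sidestep the missing object, if a copy is acceptable, is to create a real Little and copy the active member into it. This is only a sketch: the type definitions and the narrow_to_little helper are hypothetical, and the caller passes in the tag it has already checked.

```cpp
#include <cassert>

// Hypothetical stand-ins for the question's types (assumed for illustration).
struct Hdr { int type; };
struct A { static constexpr int type = 1; Hdr h; char payload; };
struct B { static constexpr int type = 2; Hdr h; int payload; };
struct C { static constexpr int type = 3; Hdr h; double payload; };

union Big { Hdr h; A a; B b; C c; };
union Little { Hdr h; B b; C c; };

// Builds a genuine Little object by copying whichever of b or c the caller
// has already identified as the active member of big.
inline Little narrow_to_little(Big const& big, int tag) {
    assert(tag == B::type || tag == C::type);
    Little little;
    if (tag == B::type) {
        little.b = big.b;  // makes b the active member of little
    } else {
        little.c = big.c;  // makes c the active member of little
    }
    return little;
}
```

The trade-off is that the Little is a separate object, so changes made through it do not reach the original Big.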
Big and Little appear to be layout-compatible, assuming they are flat data (plain old data, trivially copyable, etc.).
This means you can copy their byte representation and it works. In fact, optimizers seem to understand that a memcpy to a stack-local buffer, a placement new (with a trivial constructor), and then a memcpy back is actually a no-op.
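#include <cstring>      // std::memcpy
#include <new>          // placement new
#include <type_traits>  // std::is_pod

template<class T>
T* laundry_pod( void* data ) {
    static_assert( std::is_pod<T>{}, "POD only" ); // could be relaxed a bit
    char buff[sizeof(T)];
    std::memcpy( buff, data, sizeof(T) ); // save the bytes
    T* r = ::new( data ) T;               // start the lifetime of a T in place
    std::memcpy( data, buff, sizeof(T) ); // restore the bytes
    return r;
}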
The above function is a no-op at runtime (in an optimized build), yet it converts T-layout-compatible data at data into an actual T.
So, if I am right that Big and Little are layout-compatible whenever the active member of Big is one of the types in Little, you can do this:
Little* inplace_to_little( Big* big ) {
    return laundry_pod<Little>(big);
}
Big* inplace_to_big( Little* little ) {
    return laundry_pod<Big>(little);
}
or
void given_big(Big& big) { // cannot be const
    switch(big.h.type) {
    case B::type: // fallthrough
    case C::type: { // braces scope little so later case labels do not jump over its initialization
        auto* little = inplace_to_little(&big); // replace Big object with Little inplace
        given_b_or_c(*little);
        inplace_to_big(little); // revive Big object. Old references are valid, barring const data or inheritance
        break;
    }
    // ... other cases here ...
    }
}
If Big has non-flat data (like references or const data), the above breaks horribly.
Note that laundry_pod doesn't do any memory allocation; it uses placement new, which constructs a T in the place where data points, using the bytes already at data. And while it looks like it is doing lots of work (copying memory around), it optimizes to a no-op.
C++ has a concept of "an object exists". The existence of an object has almost nothing to do with what bits or bytes are written in the physical or abstract machine. There is no instruction in your binary that corresponds to "now an object exists".
But the language has this concept.
Objects that don't exist cannot be interacted with. If you do so, the C++ standard does not define the behavior of your program.
This permits the optimizer to make assumptions about what your code does and doesn't do, and about which branches can and cannot be reached. It lets the compiler make no-aliasing assumptions: modifying data through a pointer or reference to A cannot change data reached through a pointer or reference to B unless somehow both A and B exist in the same spot.
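A small sketch of the no-aliasing assumption, using two hypothetical unrelated types (Apple and Banana are made up here and stand in for A and B):

```cpp
#include <cassert>

// Hypothetical unrelated types standing in for A and B.
struct Apple { int weight; };
struct Banana { float length; };

// A write through b cannot legally change what a points at, so the compiler
// is free to return the constant 1 without reloading a->weight.
inline int store_both(Apple* a, Banana* b) {
    assert(a != nullptr && b != nullptr);
    a->weight = 1;
    b->length = 2.0f;
    return a->weight;
}
```

Called with two distinct objects this is perfectly well-defined. The trouble only starts when the Banana pointer was produced by casting the address of an Apple.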
The compiler can prove that Big and Little objects cannot both exist in the same spot. So no modification of any data through a pointer or reference to Little could modify anything existing in a variable of type Big. And vice versa.
Imagine that given_b_or_c modifies a field. The compiler could inline given_big, given_b_or_c and use_a_b, notice that no instance of Big is modified (just an instance of Little), and conclude that the fields of Big it cached prior to calling your code could not have been modified.
This saves it a load instruction, and the optimizer is quite happy. But now you have code that reads:
Big b = whatever;
b.foo = 7;
((Little&)b).foo = 4;
if (b.foo!=4) exit(-1);
being optimized into
Big b = whatever;
b.foo = 7;
((Little&)b).foo = 4;
exit(-1);
because it can prove that b.foo must be 7: it was set once and never modified. The access through Little could not modify the Big due to aliasing rules.
Now do this:
Big b = whatever;
b.foo = 7;
(*laundry_pod<Little>(&b)).foo = 4;
Big& b2 = *laundry_pod<Big>(&b);
if (b2.foo!=4) exit(-1);
and it cannot assume that the Big there was unchanged, because there is a memcpy and a ::new that could legally change the state of the data. No strict aliasing violation.
It can still follow the memcpy and eliminate it.
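The same memcpy idiom may be more familiar from a simpler setting: reading the bits of a float. This sketch is not part of the answer's technique; the names bits_of and float_from are made up, and it assumes a 32-bit IEEE 754 float.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <limits>

static_assert(sizeof(float) == sizeof(std::uint32_t), "sketch assumes a 32-bit float");
static_assert(std::numeric_limits<float>::is_iec559, "sketch assumes IEEE 754 floats");

// Copies the bytes of a float into an integer instead of casting pointers.
inline std::uint32_t bits_of(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}

// The reverse direction, again going through memcpy rather than a cast.
inline float float_from(std::uint32_t bits) {
    float value;
    std::memcpy(&value, &bits, sizeof value);
    return value;
}

// Small usage example: a round trip through the byte representation.
inline void round_trip_demo() {
    assert(float_from(bits_of(1.0f)) == 1.0f);
}
```

Compilers typically turn each of these memcpy calls into a single register move, so the well-defined version costs nothing over the cast.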
Live example of laundry_pod being optimized away. Note that if it wasn't optimized away, the code would have to have a conditional and a printf. But because it was, it was optimized into the empty program.