Reinterpreting a union to a different union

Question

I have a standard-layout union that has a whole bunch of types in it:

union Big {
    Hdr h;

    A a;
    B b;
    C c;
    D d;
    E e;
    F f;
};

Each of the types A through F is standard-layout and has as its first member an object of type Hdr. The Hdr identifies what the active member of the union is, so this is variant-like. Now, I'm in a situation where I know for certain (because I checked) that the active member is either a B or a C. Effectively, I've reduced the space to:

union Little {
    Hdr h;

    B b;
    C c;
};
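
For concreteness, here is a minimal sketch of what such types could look like. The question does not show the real definitions, so `Type`, the payload members, and the helper `is_b_or_c` are all hypothetical. The `static_assert`s only check the two stated preconditions (standard layout, `Hdr` first).

```cpp
#include <cstddef>      // offsetof
#include <type_traits>  // std::is_standard_layout

// Hypothetical definitions; the real Hdr, B and C are not shown in the question.
enum class Type : int { A, B, C };

struct Hdr { Type type; };

struct B {
    static constexpr Type type = Type::B;
    Hdr h;      // header comes first
    int value;  // made-up payload
};

struct C {
    static constexpr Type type = Type::C;
    Hdr h;         // header comes first
    double value;  // made-up payload
};

union Little {
    Hdr h;
    B b;
    C c;
};

// The preconditions stated in the question, checked at compile time.
static_assert(std::is_standard_layout<B>::value, "B must be standard-layout");
static_assert(std::is_standard_layout<C>::value, "C must be standard-layout");
static_assert(std::is_standard_layout<Little>::value, "Little must be standard-layout");
static_assert(offsetof(B, h) == 0, "Hdr must be the first member of B");
static_assert(offsetof(C, h) == 0, "Hdr must be the first member of C");

// Hypothetical helper: the check performed before narrowing to Little.
constexpr bool is_b_or_c(Type t) {
    return t == B::type || t == C::type;
}
```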

Now, is the following well-defined or undefined behavior?

void given_big(Big const& big) {
    switch(big.h.type) {
    case B::type: // fallthrough
    case C::type:
        given_b_or_c(reinterpret_cast<Little const&>(big));
        break;
    // ... other cases here ...
    }
}

void given_b_or_c(Little const& little) {
    if (little.h.type == B::type) {
        use_a_b(little.b);
    } else {
        use_a_c(little.c);
    }
}

The goal of Little is to effectively serve as documentation, that I've already checked that it's a B or C so in the future nobody adds code to check that it's an A or something.

Is the fact that I am reading the B subobject as a B enough to make this well-formed? Can the common initial sequence rule meaningfully be used here?

Recommended answer

To be able to take a pointer to A, and reinterpret it as a pointer to B, they must be pointer-interconvertible.

Pointer-interconvertible is about objects, not types of objects.

In C++, there are objects at places. If you have a Big at a particular spot with at least one member existing, there is also a Hdr at that same spot due to pointer-interconvertibility.
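
A small sketch of the Hdr case, using hypothetical stand-in types (the `int` tag and the `payload` member are assumptions, not the question's definitions). It only illustrates that the header reached by casting the union's address is the same object as the one reached through member access.

```cpp
// Hypothetical stand-ins for the question's types.
struct Hdr { int type; };
struct B   { Hdr h; int payload; };
union Big  { Hdr h; B b; };

// True if the header reached by casting the union's address sits at the same
// address as the header reached through member access.
bool header_shares_address(Big const& big) {
    Hdr const* via_cast = reinterpret_cast<Hdr const*>(&big);
    return via_cast == &big.h && via_cast == &big.b.h;
}

// Reads the tag through the cast pointer. A Big object, its active member,
// and that member's first member are pointer-interconvertible.
int header_type(Big const& big) {
    return reinterpret_cast<Hdr const*>(&big)->type;
}
```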

However there is no Little object at that spot. If there is no Little object there, it cannot be pointer-interconvertible with a Little object that isn't there.

They appear to be layout-compatible, assuming they are flat data (plain old data, trivially copyable, etc).

This means you can copy their byte representation and it works. In fact, optimizers seem to understand that a memcpy to a stack local buffer, a placement new (with trivial constructor), then a memcpy back is actually a noop.

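#include <cstring>      // std::memcpy
#include <new>          // placement new
#include <type_traits>  // std::is_pod

template<class T>
T* laundry_pod( void* data ) {
  static_assert( std::is_pod<T>{}, "POD only" ); // could be relaxed a bit
  char buff[sizeof(T)];
  std::memcpy( buff, data, sizeof(T) );  // save the bytes
  T* r = ::new( data ) T;                // start the lifetime of a T in place
  std::memcpy( data, buff, sizeof(T) );  // restore the bytes
  return r;
}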

The above function is a noop at runtime (in an optimized build), yet it converts T-layout-compatible data at data to an actual T.
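
As a usage sketch, the helper can be tried on two plainly layout-compatible structs. `PointA`, `PointB`, and `demo` are hypothetical names. The POD check is spelled here with `std::is_trivial` and `std::is_standard_layout`, an assumed substitution that avoids `std::is_pod`, which is deprecated since C++20.

```cpp
#include <cassert>
#include <cstring>      // std::memcpy
#include <new>          // placement new
#include <type_traits>  // std::is_trivial, std::is_standard_layout

template<class T>
T* laundry_pod(void* data) {
    static_assert(std::is_trivial<T>::value && std::is_standard_layout<T>::value,
                  "POD only");
    unsigned char buff[sizeof(T)];
    std::memcpy(buff, data, sizeof(T));  // save the bytes
    T* r = ::new (data) T;               // a T now lives at data
    std::memcpy(data, buff, sizeof(T));  // restore the bytes
    return r;
}

// Two hypothetical structs with identical member lists (layout-compatible).
struct PointA { int x; int y; };
struct PointB { int x; int y; };

int demo() {
    PointA a{3, 14};
    void* where = &a;
    PointB* b = laundry_pod<PointB>(&a);     // the PointA is replaced by a PointB
    assert(static_cast<void*>(b) == where);  // same storage, nothing was allocated
    // From here on, only use *b; the name `a` no longer refers to a live PointA.
    return b->x * 100 + b->y;
}
```

The returned pointer is the only valid way to reach the new object; the original variable name should not be used again unless the storage is converted back.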

So, if I am right and Big and Little are layout-compatible when the types in Little are a subset of the types in Big, you can do this:

Little* inplace_to_little( Big* big ) {
  return laundry_pod<Little>(big);
}
Big* inplace_to_big( Little* little ) {
  return laundry_pod<Big>(little);
}

void given_big(Big& big) { // cannot be const
  switch(big.h.type) {
  case B::type: // fallthrough
  case C::type: { // braces needed: a later case label may not jump past the declaration below
    auto* little = inplace_to_little(&big); // replace Big object with Little in place
    given_b_or_c(*little);
    inplace_to_big(little); // revive Big object.  Old references are valid, barring const data or inheritance
    break;
  }
  // ... other cases here ...
  }
}
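
One possible refinement, sketched under the answer's own premise that the in-place conversion is valid: wrap the two conversions in a scope guard so the Big is always revived, even on an early return or an exception. The guard class `AsLittle`, the concrete types, and the equal-size requirement are assumptions made for this sketch, not part of the original code.

```cpp
#include <cstring>      // std::memcpy
#include <new>          // placement new
#include <type_traits>  // std::is_trivial, std::is_standard_layout

template<class T>
T* laundry_pod(void* data) {
    static_assert(std::is_trivial<T>::value && std::is_standard_layout<T>::value,
                  "POD only");
    unsigned char buff[sizeof(T)];
    std::memcpy(buff, data, sizeof(T));
    T* r = ::new (data) T;
    std::memcpy(data, buff, sizeof(T));
    return r;
}

// Hypothetical types standing in for the question's definitions.
struct Hdr { int type; };
struct A { Hdr h; char tag[8]; };
struct B { Hdr h; int value; };
struct C { Hdr h; double value; };
union Big    { Hdr h; A a; B b; C c; };
union Little { Hdr h; B b; C c; };

// Assumption of this sketch: both unions cover exactly the same bytes.
static_assert(sizeof(Big) == sizeof(Little), "sketch assumes equal sizes");

// Guard: a Little lives in the Big's storage while the guard exists;
// the destructor recreates a Big there.
class AsLittle {
public:
    explicit AsLittle(Big& big) : little_(laundry_pod<Little>(&big)) {}
    ~AsLittle() { laundry_pod<Big>(little_); }
    AsLittle(AsLittle const&) = delete;
    AsLittle& operator=(AsLittle const&) = delete;
    Little& get() const { return *little_; }
private:
    Little* little_;
};

int demo_round_trip() {
    Big big;
    big.b = B{{1}, 7};           // active member is b
    {
        AsLittle view(big);      // Big replaced by Little in place
        view.get().b.value = 4;  // modify through the Little
    }                            // guard revives the Big
    return big.b.value;          // read through the revived Big
}
```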

If Big has non-flat data (like references or const data), the above breaks horribly.

Note that laundry_pod doesn't do any memory allocation; it uses placement new that constructs a T in the place where data points using the bytes at data. And while it looks like it is doing lots of stuff (copying memory around), it optimizes to a noop.

C++ has a concept of "an object exists". The existence of an object has almost nothing to do with what bits or bytes are written in the physical or abstract machine. There is no instruction in your binary that corresponds to "now an object exists".

But the language has this concept.

Objects that don't exist cannot be interacted with. If you do so, the C++ standard does not define the behavior of your program.

This permits the optimizer to make assumptions about what your code does and what it doesn't do and which branches cannot be reached and which can be reached. It lets the compiler make no-aliasing assumptions; modifying data through a pointer or reference to A cannot change data reached through a pointer or reference to B unless somehow both A and B exist in the same spot.
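
A minimal illustration of the no-aliasing assumption, using `float` and `std::uint32_t` instead of the question's types (an assumed simplification). The function names are hypothetical, and the sketch assumes IEEE 754 `float`, which the `static_assert` checks.

```cpp
#include <cstdint>
#include <cstring>  // std::memcpy
#include <limits>

static_assert(std::numeric_limits<float>::is_iec559 &&
              sizeof(float) == sizeof(std::uint32_t),
              "sketch assumes a 32-bit IEEE 754 float");

// The compiler may assume f and i point to different objects, because a
// float object and a uint32_t object cannot both exist at the same spot.
float store_then_read(float* f, std::uint32_t* i) {
    *f = 1.0f;
    *i = 0;     // assumed not to modify *f
    return *f;  // may be folded to 1.0f without reloading
}

// Copying the byte representation is the sanctioned way to look at the
// bits of one type as another; no aliasing rule is involved.
std::uint32_t bits_of(float value) {
    std::uint32_t bits;
    std::memcpy(&bits, &value, sizeof bits);
    return bits;
}
```

Calling `store_then_read` with both pointers aimed at the same storage (via a cast) would be exactly the kind of access the standard leaves undefined.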

The compiler can prove that Big and Little objects cannot both exist in the same spot. So no modification of any data through a pointer or reference to Little could modify anything existing in a variable of type Big. And vice versa.

Imagine if given_b_or_c modifies a field. Well the compiler could inline given_big and given_b_or_c and use_a_b, notice that no instance of Big is modified (just an instance of Little), and prove that fields of data from Big it cached prior to calling your code could not be modified.

This saves it a load instruction, and the optimizer is quite happy. But now you have code that reads:

Big b = whatever;
b.foo = 7;
((Little&)b).foo = 4;
if (b.foo!=4) exit(-1);

which may be optimized to

Big b = whatever;
b.foo = 7;
((Little&)b).foo = 4;
exit(-1);

because it can prove that b.foo must be 7: it was set once and never modified. The access through Little could not modify the Big due to aliasing rules.

Now do this:

Big b = whatever;
b.foo = 7;
(*laundry_pod<Little>(&b)).foo = 4;
Big& b2 = *laundry_pod<Big>(&b);
if (b2.foo!=4) exit(-1);

and it cannot assume that the Big there was unchanged, because there is a memcpy and a ::new that could legally change the state of the data. No strict aliasing violation.

It can still follow the memcpy and eliminate it.

Live example of laundry_pod being optimized away. Note that if it wasn't optimized away, the code would have to have a conditional and a printf. But because it was, it was optimized into the empty program.
