Type punning and unions in C

Question

I'm currently working on a project to build a small compiler just for the heck of it.

I've decided to build an extremely simple virtual machine to target, so I don't have to worry about learning the ins and outs of ELF, Intel assembly, etc.

My question is about type punning in C using unions. I've decided to support only 32-bit integers and 32-bit float values in the VM's memory. To facilitate this, the "main memory" of the VM is set up like this:

typedef union
{
    int i;
    float f;
} word;

memory = (word *)malloc(mem_size * sizeof(word));

So I can then just treat the memory section as either an int or a float depending on the instruction.
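A minimal sketch of how instruction handlers might use such a memory, assuming the `word` union above. The helper names (`vm_alloc`, `vm_store_int`, `vm_store_float`, `vm_load_int`, `vm_load_float`) are hypothetical and not part of the original project; each one simply picks the union member that its instruction expects.

```c
#include <assert.h>
#include <stdlib.h>

/* Same layout as the union in the question. */
typedef union {
    int i;
    float f;
} word;

/* Hypothetical helper: allocate zero-initialized VM memory of mem_size words.
   Returns NULL if the allocation fails. */
word *vm_alloc(size_t mem_size)
{
    return calloc(mem_size, sizeof(word));
}

/* Hypothetical handlers: integer instructions touch .i, float instructions touch .f. */
void vm_store_int(word *memory, size_t addr, int value)
{
    memory[addr].i = value;
}

void vm_store_float(word *memory, size_t addr, float value)
{
    memory[addr].f = value;
}

int vm_load_int(const word *memory, size_t addr)
{
    return memory[addr].i;
}

float vm_load_float(const word *memory, size_t addr)
{
    return memory[addr].f;
}
```

Reading back the same member that was last stored is always well defined; the question is about what happens when a load uses the other member.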

Is this technically type punning? It certainly would be if I were to use ints as the words of memory and then use a float* to treat them like floats. I don't think my current approach, while syntactically different, is semantically different. In the end I'm still treating 32 bits in memory as either an int or a float.

The only information I could come up with online suggests that this is implementation dependent. Is there a more portable way to achieve this without wasting a bunch of space?

I could do the following, but then I would be taking up more than 2 times as much memory and "reinventing the wheel" with respect to unions.

typedef struct
{
    int i;
    float f;
    char is_int;
} tagged_word; /* type name is illustrative */
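To make the memory cost concrete, here is a small sketch that compares the two layouts. The name `tagged_word` and the two size helpers are hypothetical, introduced only for this comparison. On a typical ABI with a 4-byte `int` and `float`, the union occupies 4 bytes and the struct 12 (two 4-byte members, the 1-byte flag, and padding), but the exact figures are implementation-defined.

```c
#include <assert.h>
#include <stddef.h>

/* The union from the question: both members share the same storage. */
typedef union {
    int i;
    float f;
} word;

/* The alternative: both values are stored side by side, plus a flag. */
typedef struct {
    int i;
    float f;
    char is_int;
} tagged_word; /* hypothetical name */

/* Hypothetical helpers that report the size of one memory cell in each design. */
size_t word_size(void)
{
    return sizeof(word);
}

size_t tagged_word_size(void)
{
    return sizeof(tagged_word);
}
```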

Edit

I perhaps didn't make my exact question clear. I am aware that I can use either a float or an int from a union without undefined behavior. What I'm after is specifically a way to have a 32-bit memory location that I can safely use as an int or a float without knowing what the last value set was. I want to account for the situation where the other type is used.

Recommended answer

Yes, storing to one member of a union and reading another is type punning (assuming the types are sufficiently different). Moreover, this is the only universal (any type to any type) kind of type punning that is officially supported by the C language. It is supported in the sense that the language promises that the type punning will actually occur, i.e. that a physical attempt to read an object of one type as an object of another type will take place. Among other things, this means that writing one member of the union and reading another implies a data dependency between the write and the read. This, however, still leaves you with the burden of ensuring that the type punning does not produce a trap representation.
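A minimal sketch of union-based punning, under two assumptions that the answer does not state: `float` is 32 bits wide, and it uses the IEEE 754 binary32 format (under which `1.0f` has the bit pattern `0x3F800000`). The names `pun32` and `float_bits_via_union` are hypothetical. The integer side uses `uint32_t` rather than `int` because that type has no padding bits and therefore no trap representations, which removes that burden for the float-to-integer direction.

```c
#include <assert.h>
#include <stdint.h>

/* Compile-time check: this sketch assumes float and uint32_t have the same size. */
typedef char float_must_be_32_bits[sizeof(float) == sizeof(uint32_t) ? 1 : -1];

typedef union {
    uint32_t u;
    float f;
} pun32; /* hypothetical name */

/* Store the float member, then read the integer member of the same union.
   The read depends on the write, so the compiler may not reorder them. */
uint32_t float_bits_via_union(float value)
{
    pun32 p;
    p.f = value;
    return p.u;
}
```

Going the other way (integer bits read as a `float`) can yield a NaN or another special value, so a VM built this way still has to decide how to treat such results.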

When you use pointer casts for type punning (what is usually understood as "classic" type punning), the language explicitly states that in the general case the behavior is undefined (aside from reinterpreting an object's value as an array of chars and other restricted cases). Compilers like GCC implement so-called "strict aliasing semantics", which basically means that pointer-based type punning might not work as you expect. For example, the compiler might (and will) ignore the data dependency between type-punned reads and writes and rearrange them arbitrarily, thus completely ruining your intent. This

int i;
float f;

i = 5;
f = *(float *) &i;

can easily be rearranged into what is effectively

f = *(float *) &i;
i = 5;

specifically because a compiler that applies strict aliasing deliberately ignores the possibility of a data dependency between the write and the read in the example.

In a modern C compiler, when you really need to perform a physical reinterpretation of one object's value as a value of another type, you are restricted to either memcpy-ing bytes from one object to another or union-based type punning. There are no other ways. Casting pointers is no longer a viable option.
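The memcpy route can be sketched as follows, again assuming a 32-bit IEEE 754 `float`. The function names `float_to_bits` and `bits_to_float` are hypothetical. Copying the bytes between two distinct objects avoids the pointer cast entirely, so strict aliasing does not come into play.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Compile-time check: this sketch assumes float and uint32_t have the same size. */
typedef char float_size_check[sizeof(float) == sizeof(uint32_t) ? 1 : -1];

/* Copy the bytes of a float into a 32-bit unsigned integer. */
uint32_t float_to_bits(float value)
{
    uint32_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Copy the bytes of a 32-bit unsigned integer into a float. */
float bits_to_float(uint32_t bits)
{
    float value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

Optimizing compilers commonly reduce a fixed-size memcpy like this to a plain register move, so the approach need not cost anything at run time, though that is a quality-of-implementation matter rather than a guarantee.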
