Unions, aliasing and type-punning in practice: what works and what does not?


Problem description


I have a problem understanding what can and cannot be done using unions with GCC. I read the questions about it (in particular here and here), but they focus on the C++ standard, and I feel there's a mismatch between the C++ standard and practice (the commonly used compilers).

In particular, I recently found confusing information in the GCC online documentation while reading about the compilation flag -fstrict-aliasing. It says:

-fstrict-aliasing

Allow the compiler to assume the strictest aliasing rules applicable to the language being compiled. For C (and C++), this activates optimizations based on the type of expressions. In particular, an object of one type is assumed never to reside at the same address as an object of a different type, unless the types are almost the same. For example, an unsigned int can alias an int, but not a void* or a double. A character type may alias any other type. Pay special attention to code like this:

union a_union {
  int i;
  double d;
};

int f() {
  union a_union t;
  t.d = 3.0;
  return t.i;
}

The practice of reading from a different union member than the one most recently written to (called "type-punning") is common. Even with -fstrict-aliasing, type-punning is allowed, provided the memory is accessed through the union type. So, the code above works as expected.

This is what I think I understood from this example and my doubts:

1) aliasing only works between similar types, or char

Consequence of 1): aliasing - as the word suggests - is when you have one value and two members to access it (i.e. the same bytes);

Doubt: are two types similar when they have the same size in bytes? If not, what are similar types?

Consequence of 1): for non-similar types (whatever this means), aliasing does not work;

2) type punning is when we read a different member than the one we wrote to; it's common and it works as expected as long as the memory is accessed through the union type;

Doubt: is aliasing a specific case of type-punning where types are similar?

I get confused because it says unsigned int and double are not similar, so aliasing does not work; then in the example it's aliasing between int and double and it clearly says it works as expected, but calls it type-punning: not because types are or are not similar, but because it's reading from a member it did not write. But reading from a member it did not write is what I understood aliasing is for (as the word suggests). I'm lost.

The questions: can someone clarify the difference between aliasing and type-punning and what uses of the two techniques are working as expected in GCC? And what does the compiler flag do?

Solution

Aliasing can be taken literally for what it means: it is when two different expressions refer to the same object. Type-punning is to "pun" a type, i.e. to use an object of some type as if it had a different type.

Formally, type-punning is undefined behaviour with only a few exceptions. It commonly happens when you fiddle with bits carelessly:

int mantissa(float f)
{
    return (int&)f & 0x7FFFFF;    // Accessing a float as if it's an int
}
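For comparison, here is a minimal sketch of how the same bits could be read without the undefined behaviour, by copying the bytes with std::memcpy instead of casting the reference. The names float_bits, mantissa_bits and check_mantissa_bits are made up for illustration, and the code assumes float is a 32-bit IEEE-754 value, which is the case on common GCC targets but is not guaranteed by the standard.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Assumption: float is a 32-bit IEEE-754 binary32 value.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");

// Hypothetical helper: copy the bytes of a float into an integer of the same
// size instead of reading the float through an int reference.
std::uint32_t float_bits(float f)
{
    std::uint32_t bits;
    std::memcpy(&bits, &f, sizeof bits);  // byte-wise copy, no aliasing violation
    return bits;
}

// Extracts the low 23 bits (the fraction field), like mantissa() above.
std::uint32_t mantissa_bits(float f)
{
    return float_bits(f) & 0x7FFFFFu;
}

// Usage sketch: 1.0f is 0x3F800000 and 1.5f is 0x3FC00000 in binary32.
void check_mantissa_bits()
{
    assert(mantissa_bits(1.0f) == 0u);
    assert(mantissa_bits(1.5f) == 0x400000u);
}
```

Compilers such as GCC typically turn a fixed-size memcpy like this into a plain register move, so this form usually costs nothing compared to the cast.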

The exceptions are (simplified)

  • Accessing integers as their unsigned/signed counterparts
  • Accessing anything as a char, unsigned char or std::byte
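
Both exceptions can be sketched as follows. The function names are hypothetical; the last check additionally assumes a two's complement representation for int, which C++20 requires and which mainstream compilers have always used.

```cpp
#include <cassert>
#include <cstddef>

// Exception 1: an int may be read through an lvalue of its unsigned counterpart.
unsigned as_unsigned(const int& value)
{
    return *reinterpret_cast<const unsigned*>(&value);
}

// Exception 2: any object may be inspected byte by byte through unsigned char.
// This hypothetical helper counts the non-zero bytes of the representation.
template <typename T>
std::size_t count_nonzero_bytes(const T& value)
{
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(&value);
    std::size_t count = 0;
    for (std::size_t i = 0; i < sizeof(T); ++i) {
        if (bytes[i] != 0) {
            ++count;
        }
    }
    return count;
}

// Usage sketch with values whose result does not depend on endianness.
void check_aliasing_exceptions()
{
    assert(as_unsigned(42) == 42u);
    assert(as_unsigned(-1) == ~0u);        // assumes two's complement
    assert(count_nonzero_bytes(0) == 0);
    assert(count_nonzero_bytes(1) == 1);   // exactly one byte holds the 1
}
```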

This is known as the strict-aliasing rule: the compiler can safely assume that two expressions of different types never refer to the same object (apart from the exceptions above), because a program in which they did would have undefined behaviour. This facilitates optimizations such as:

void transform(float* dst, const int* src, int n)
{
    for(int i = 0; i < n; i++)
        dst[i] = src[i];    // Can be unrolled and use vector instructions
                            // If dst and src alias the results would be wrong
}
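
A smaller function makes the assumption easier to see. This is only a sketch with made-up names: because int and float are different types, the compiler may assume the store through f cannot change *i, so with -fstrict-aliasing it is allowed to return the constant 1 without reloading *i. The function is well defined as long as the two pointers really refer to distinct objects, as in the check below; passing two pointers to the same storage would be undefined behaviour.

```cpp
#include <cassert>

// Hypothetical example: i and f have different types, so under the
// strict-aliasing rule they are assumed to point to different objects.
int store_both(int* i, float* f)
{
    *i = 1;
    *f = 0.0f;   // assumed not to modify *i
    return *i;   // may therefore be compiled as "return 1"
}

// Well-defined usage: the pointers refer to two distinct objects.
void check_store_both()
{
    int i = 0;
    float f = 1.0f;
    assert(store_both(&i, &f) == 1);
    assert(i == 1);
    assert(f == 0.0f);
}
```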


What gcc says is that it relaxes the rules a bit and allows type-punning through unions, even though the standard doesn't require it to:

union {
    int64_t num;
    struct {
        int32_t hi, lo;
    } parts;
} u = {42};
u.parts.hi = 420;

This is the type-pun gcc guarantees will work. Other cases may appear to work but may silently break one day.
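
As a rough sketch of a read through a union, mirroring the a_union example quoted from the GCC documentation: the code below writes one member and reads another. The names FloatBits and float_bits_via_union are hypothetical. It relies on the behaviour GCC documents for -fstrict-aliasing, not on anything ISO C++ promises, and it assumes float is a 32-bit IEEE-754 value.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: float is a 32-bit IEEE-754 binary32 value.
static_assert(sizeof(float) == sizeof(std::uint32_t), "float must be 32 bits wide");

// Hypothetical union used only for this illustration.
union FloatBits {
    float value;
    std::uint32_t bits;
};

// Writes one member and reads the other. GCC documents that this works when
// the access goes through the union type; the C++ standard does not.
std::uint32_t float_bits_via_union(float f)
{
    FloatBits u;
    u.value = f;
    return u.bits;
}

// Usage sketch: 1.0f is 0x3F800000 in binary32.
void check_float_bits_via_union()
{
    assert(float_bits_via_union(1.0f) == 0x3F800000u);
}
```

Note that the guarantee only covers accesses made through the union object itself; taking pointers to the members and reading through those is back in strict-aliasing territory.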
