gcc, strict-aliasing, and casting through a union

Question

Do you have any horror stories to tell? The GCC Manual recently added a warning regarding -fstrict-aliasing and casting a pointer through a union:

[...] Taking the address, casting the resulting pointer and dereferencing the result has undefined behavior [emphasis added], even if the cast uses a union type, e.g.:

    union a_union {
        int i;
        double d;
    };

    int f() {
        double d = 3.0;
        return ((union a_union *)&d)->i;
    }

Does anyone have an example to illustrate this undefined behavior?

Note this question is not about what the C99 standard says, or does not say. It is about the actual functioning of gcc, and other existing compilers, today.

I am only guessing, but one potential problem may lie in the setting of d to 3.0. Because d is a temporary variable which is never directly read, and which is never read via a 'somewhat-compatible' pointer, the compiler may not bother to set it. And then f() will return some garbage from the stack.

My simple, naive, attempt fails. For example:

#include <stdio.h>

union a_union {
    int i;
    double d;
};

int f1(void) {
    union a_union t;
    t.d = 3333333.0;
    return t.i; // gcc manual: 'type-punning is allowed, provided...' (C90 6.3.2.3)
}

int f2(void) {
    double d = 3333333.0;
    return ((union a_union *)&d)->i; // gcc manual: 'undefined behavior' 
}

int main(void) {
    printf("%d\n", f1());
    printf("%d\n", f2());
    return 0;
}

works fine, giving on CYGWIN:

-2147483648
-2147483648
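
The printed value can be cross-checked without raising any aliasing question, by copying the bytes of the double into an integer of the same size. The sketch below is not part of the original test: the helper names double_bits and low_word are hypothetical, and it assumes an IEEE-754 binary64 double whose byte order matches that of uint64_t.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: return the object representation of a double
   as a 64-bit unsigned integer. Assumes sizeof(double) == 8. */
uint64_t double_bits(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);   /* byte copy, no type-punned access */
    return bits;
}

/* Low 32 bits of the representation: the half that f1() and f2() read
   on a little-endian target such as x86. */
uint32_t low_word(double d) {
    return (uint32_t)(double_bits(d) & 0xFFFFFFFFu);
}
```

Under those assumptions, low_word(3333333.0) is 0x80000000, which is the bit pattern of -2147483648 in a 32-bit two's-complement int, in agreement with the output above.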

Looking at the assembler, we see that gcc completely optimizes t away: f1() simply stores the pre-calculated answer:

movl    $-2147483648, %eax

while f2() pushes 3333333.0 onto the floating-point stack, and then extracts the return value:

flds   LC0                 # LC0: 1246458708 (= 3333333.0) (--> 80 bits)
fstpl  -8(%ebp)            # save in d (64 bits)
movl   -8(%ebp), %eax      # return value (32 bits)

And the functions are also inlined (which seems to be the cause of some subtle strict-aliasing bugs) but that is not relevant here. (And this assembler is not that relevant, but it adds corroborative detail.)

Also note that taking addresses is obviously wrong (or right, if you are trying to illustrate undefined behavior). For example, just as we know this is wrong:

extern void foo(int *, double *);
union a_union t;
t.d = 3.0;
foo(&t.i, &t.d); // undefined behavior

we likewise know this is wrong:

extern void foo(int *, double *);
double d = 3.0;
foo(&((union a_union *)&d)->i, &d); // undefined behavior
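
One way to see why both calls are a problem: under -fstrict-aliasing, the compiler is allowed to assume inside the callee that an int * and a double * never refer to the same storage. The sketch below uses a hypothetical function store_both (it is not the foo from the snippets above, whose body is not given) to show the kind of code where that assumption matters.

```c
/* Hypothetical callee. With -fstrict-aliasing the compiler may assume
   *ip and *dp are distinct objects, because int and double are not
   compatible types. */
int store_both(int *ip, double *dp) {
    *ip = 1;
    *dp = 2.0;      /* assumed not to modify *ip */
    return *ip;     /* may be compiled as "return 1" with no reload */
}
```

Called with two separate objects, this returns 1 as expected. Called with pointers into the same storage, as in the two snippets above, the behavior is undefined, and an optimizing build may still return 1 even though the store through dp overwrote the bytes of *ip.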

For background discussion about this, see for example:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1422.pdf
http://gcc.gnu.org/ml/gcc/2010-01/msg00013.html
http://davmac.wordpress.com/2010/02/26/c99-revisited/
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
( = search page on Google then view cached page )

What is the strict aliasing rule?
C99 strict aliasing rules in C++ (GCC)

In the first link, draft minutes of an ISO meeting seven months ago, one participant notes in section 4.16:

Is there anybody that thinks the rules are clear enough? No one is really able to interpret them.

Other notes: My test was with gcc 4.3.4, with -O2; options -O2 and -O3 imply -fstrict-aliasing. The example from the GCC Manual assumes sizeof(double) >= sizeof(int); it doesn't matter if they are unequal.

Also, as noted by Mike Acton in the cellperformance link, -Wstrict-aliasing=2 (but not =3) produces the warning "dereferencing type-punned pointer might break strict-aliasing rules" for the example here.

Recommended answer

The fact that GCC is warning about unions doesn't necessarily mean that unions don't currently work. But here's a slightly less simple example than yours:

#include <stdio.h>

struct B {
    int i1;
    int i2;
};

union A {
    struct B b;
    double d;
};

int main() {
    double d = 3.0;
    #ifdef USE_UNION
        ((union A*)&d)->b.i2 += 0x80000000;
    #else
        ((int*)&d)[1] += 0x80000000;
    #endif
    printf("%g\n", d);
}

Output:

$ gcc --version
gcc (GCC) 4.3.4 20090804 (release) 1
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -oalias alias.c -O1 -std=c99 && ./alias
-3

$ gcc -oalias alias.c -O3 -std=c99 && ./alias
3

$ gcc -oalias alias.c -O1 -std=c99 -DUSE_UNION && ./alias
-3

$ gcc -oalias alias.c -O3 -std=c99 -DUSE_UNION && ./alias
-3

So on GCC 4.3.4, the union "saves the day" (assuming I want the output "-3"). It disables the optimisation that relies on strict aliasing and that results in the output "3" in the second case (only). With -Wall, USE_UNION also disables the type-pun warning.

I don't have gcc 4.4 to test, but please give this code a go. Your code in effect tests whether the memory for d is initialised before reading back through a union: mine tests whether it is modified.

Btw, the safe way to read half of a double as an int is:

#include <string.h>

int thing(void) {
    double d = 3;
    int i;
    memcpy(&i, &d, sizeof i);  /* copy the first sizeof(int) bytes of d */
    return i;
}

With optimisation on GCC, this results in:

    int thing() {
401130:       55                      push   %ebp
401131:       89 e5                   mov    %esp,%ebp
401133:       83 ec 10                sub    $0x10,%esp
        double d = 3;
401136:       d9 05 a8 20 40 00       flds   0x4020a8
40113c:       dd 5d f0                fstpl  -0x10(%ebp)
        int i;
        memcpy(&i, &d, sizeof i);
40113f:       8b 45 f0                mov    -0x10(%ebp),%eax
        return i;
    }
401142:       c9                      leave
401143:       c3                      ret

So there's no actual call to memcpy. If you aren't doing this, you deserve what you get if union casts stop working in GCC ;-)
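
The same byte-copy approach also covers the modifying test from earlier in this answer. The following is a minimal sketch, not code from the original answer: flip_sign is a hypothetical name, and it assumes an IEEE-754 binary64 double with the same size and byte order as uint64_t, so that the sign is the top bit of the 64-bit representation.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: toggle the sign bit of a double using memcpy
   in both directions, so no object is accessed through an lvalue of
   the wrong type. Assumes sizeof(double) == sizeof(uint64_t). */
double flip_sign(double d) {
    uint64_t bits;
    memcpy(&bits, &d, sizeof bits);          /* double -> integer bits */
    bits ^= UINT64_C(0x8000000000000000);    /* toggle bit 63 (the sign) */
    memcpy(&d, &bits, sizeof d);             /* integer bits -> double */
    return d;
}
```

Under those assumptions flip_sign(3.0) is -3.0 at any optimisation level, since nothing here depends on how the compiler treats aliasing.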
