gcc, strict-aliasing, and casting through a union


Question

Do you have any horror stories to tell? The GCC Manual recently added a warning regarding -fstrict-aliasing and casting a pointer through a union:

[...] Taking the address, casting the resulting pointer and dereferencing the result has undefined behavior [emphasis added], even if the cast uses a union type, e.g.:

    union a_union {
        int i;
        double d;
    };

    int f() {
        double d = 3.0;
        return ((union a_union *)&d)->i;
    }

Does anyone have an example to illustrate this undefined behavior?

Note this question is not about what the C99 standard says, or does not say. It is about the actual functioning of gcc, and other existing compilers, today.

I am only guessing, but one potential problem may lie in the setting of d to 3.0. Because d is a temporary variable which is never directly read, and which is never read via a 'somewhat-compatible' pointer, the compiler may not bother to set it. And then f() will return some garbage from the stack.

My simple, naive, attempt fails. For example:

#include <stdio.h>

union a_union {
    int i;
    double d;
};

int f1(void) {
    union a_union t;
    t.d = 3333333.0;
    return t.i; // gcc manual: 'type-punning is allowed, provided...' (C90 6.3.2.3)
}

int f2(void) {
    double d = 3333333.0;
    return ((union a_union *)&d)->i; // gcc manual: 'undefined behavior' 
}

int main(void) {
    printf("%d\n", f1());
    printf("%d\n", f2());
    return 0;
}

works fine; on Cygwin it prints:

-2147483648
-2147483648

Looking at the assembler, we see that gcc completely optimizes t away: f1() simply stores the pre-calculated answer:

movl    $-2147483648, %eax

while f2() pushes 3333333.0 onto the floating-point stack, and then extracts the return value:

flds   LC0                 # LC0: 1246458708 (= 3333333.0) (--> 80 bits)
fstpl  -8(%ebp)            # save in d (64 bits)
movl   -8(%ebp), %eax      # return value (32 bits)

And the functions are also inlined (which seems to be the cause of some subtle strict-aliasing bugs) but that is not relevant here. (And this assembler is not that relevant, but it adds corroborative detail.)

Also note that taking addresses is obviously wrong (or right, if you are trying to illustrate undefined behavior). For example, just as we know this is wrong:

extern void foo(int *, double *);
union a_union t;
t.d = 3.0;
foo(&t.i, &t.d); // undefined behavior

we likewise know this is wrong:

extern void foo(int *, double *);
double d = 3.0;
foo(&((union a_union *)&d)->i, &d); // undefined behavior

For background discussion about this, see for example:

http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1422.pdf
http://gcc.gnu.org/ml/gcc/2010-01/msg00013.html
http://davmac.wordpress.com/2010/02/26/c99-revisited/
http://cellperformance.beyond3d.com/articles/2006/06/understanding-strict-aliasing.html
http://stackoverflow.com/questions/98650/what-is-the-strict-aliasing-rule
http://stackoverflow.com/questions/2771023/c99-strict-aliasing-rules-in-c-gcc/2771041#2771041

In the first link, draft minutes of an ISO meeting seven months ago, one participant notes in section 4.16:

"Is there anybody that thinks the rules are clear enough? No one is really able to interpret them."

Other notes: My test was with gcc 4.3.4, with -O2; options -O2 and -O3 imply -fstrict-aliasing. The example from the GCC Manual assumes sizeof(double) >= sizeof(int); it doesn't matter if they are unequal.

Also, as noted by Mike Acton in the cellperformance link, -Wstrict-aliasing=2, but not =3, produces "warning: dereferencing type-punned pointer might break strict-aliasing rules" for the example here.

Solution

The fact that GCC is warning about unions doesn't necessarily mean that unions don't currently work. But here's a slightly less simple example than yours:

#include <stdio.h>

struct B {
    int i1;
    int i2;
};

union A {
    struct B b;
    double d;
};

int main() {
    double d = 3.0;
    #ifdef USE_UNION
        ((union A*)&d)->b.i2 += 0x80000000;
    #else
        ((int*)&d)[1] += 0x80000000;
    #endif
    printf("%g\n", d);
}

Output:

$ gcc --version
gcc (GCC) 4.3.4 20090804 (release) 1
Copyright (C) 2008 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

$ gcc -oalias alias.c -O1 -std=c99 && ./alias
-3

$ gcc -oalias alias.c -O3 -std=c99 && ./alias
3

$ gcc -oalias alias.c -O1 -std=c99 -DUSE_UNION && ./alias
-3

$ gcc -oalias alias.c -O3 -std=c99 -DUSE_UNION && ./alias
-3

So on GCC 4.3.4, the union "saves the day" (assuming I want the output "-3"). It disables the optimisation that relies on strict aliasing and that results in the output "3" in the second case (only). With -Wall, USE_UNION also disables the type-pun warning.

I don't have gcc 4.4 to test, but please give this code a go. Your code in effect tests whether the memory for d is initialised before reading back through a union: mine tests whether it is modified.

Btw, the safe way to read half of a double as an int is:

#include <string.h>

int thing(void) {
    double d = 3;
    int i;
    memcpy(&i, &d, sizeof i);  /* copies the first sizeof(int) bytes of d */
    return i;
}

With optimisation on GCC, this results in:

    int thing() {
401130:       55                      push   %ebp
401131:       89 e5                   mov    %esp,%ebp
401133:       83 ec 10                sub    $0x10,%esp
        double d = 3;
401136:       d9 05 a8 20 40 00       flds   0x4020a8
40113c:       dd 5d f0                fstpl  -0x10(%ebp)
        int i;
        memcpy(&i, &d, sizeof i);
40113f:       8b 45 f0                mov    -0x10(%ebp),%eax
        return i;
    }
401142:       c9                      leave
401143:       c3                      ret

So there's no actual call to memcpy. If you aren't doing this, you deserve what you get if union casts stop working in GCC ;-)
