Accessing C union members via pointers

Question

Does accessing union members via a pointer, as in the example below, result in undefined behavior in C99? The intent seems clear enough, but I know that there are some restrictions regarding aliasing and unions.

union { int i; char c; } u;

int  *ip = &u.i;
char *ic = &u.c;

*ip = 0;
*ic = 'a';
printf("%c\n", u.c);

Answer

Reading a union through any member other than the one last written is unspecified behaviour (subtly different from undefined behaviour). C99 Annex J spells this out:

The following are unspecified:
   :
   The value of a union member other than the last one stored into (6.2.6.1).

However, since you are writing to c via the pointer, then reading c, this particular example is well defined. It does not matter how you write to the element:

u.c = 'a';        // direct write.
*(&(u.c)) = 'a';  // variation on yours, writing through element pointer.
(&u)->c = 'a';    // writing through structure pointer.
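
As a rough sketch of the three write paths above, the functions below each store to c last and then read c back. The tag `ic_pair` and the function names are made up for illustration; they are not from the question.

```c
#include <assert.h>

/* Same shape as the union in the question; the tag is hypothetical. */
union ic_pair { int i; char c; };

/* Write c directly through the union object, then read c back. */
char write_direct(void)
{
    union ic_pair u;
    u.i = 0;
    u.c = 'a';
    return u.c;
}

/* Write c through a pointer to the member, as the question does. */
char write_via_member_pointer(void)
{
    union ic_pair u;
    int  *ip = &u.i;
    char *ic = &u.c;
    *ip = 0;
    *ic = 'a';
    return u.c;
}

/* Write c through a pointer to the whole union. */
char write_via_union_pointer(void)
{
    union ic_pair u;
    union ic_pair *p = &u;
    p->i = 0;
    p->c = 'a';
    return u.c;
}

/* Every path stores to c last and reads c, so all three must agree. */
void check_writes_agree(void)
{
    assert(write_direct() == 'a');
    assert(write_via_member_pointer() == 'a');
    assert(write_via_union_pointer() == 'a');
}
```

Calling `check_writes_agree()` from your own `main` should pass silently, since the member read is always the member last stored.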


One issue raised in the comments appears, at least on the surface, to contradict that. User davmac provides sample code:

// Compile with "-O3 -std=c99" eg:
//  clang -O3 -std=c99 test.c
//  gcc -O3 -std=c99 test.c
// On clang v3.5.1, output is "123"
// On gcc 4.8.4, output is "1073741824"
//
// Different outputs, so either:
// * program invokes undefined behaviour; both compilers are correct OR
// * compiler vendors interpret standard differently OR
// * one compiler or the other has a bug

#include <stdio.h>

union u
{
    int i;
    float f;
};

int someFunc(union u * up, float *fp)
{
    up->i = 123;
    *fp = 2.0;     // does this set the union member?
    return up->i;  // then this should not return 123!
}

int main(int argc, char **argv)
{
    union u uobj;
    printf("%d\n", someFunc(&uobj, &uobj.f));
    return 0;
}

which outputs different values on different compilers. However, I believe that is because the code actually violates the rules: it writes to member f and then reads member i and, as shown in Annex J, that's unspecified.

There is a footnote 82 in 6.5.2.3 which states:

If the member used to access the contents of a union object is not the same as the member last used to store a value in the object, the appropriate part of the object representation of the value is reinterpreted as an object representation in the new type.

However, since this seems to go against the Annex J comment and it's a footnote to the section dealing with expressions of the form x.y, it may not apply to accesses via a pointer.

One of the major reasons why aliasing is supposed to be strict is to give the compiler more scope for optimisation. To that end, the standard dictates that reading memory as a type different from the one it was written as is unspecified.

By way of example, consider the function provided:

int someFunc(union u * up, float *fp)
{
    up->i = 123;
    *fp = 2.0;     // does this set the union member?
    return up->i;  // then this should not return 123!
}

The implementation is free to assume that, because you're not supposed to alias memory, up->i and *fp are two distinct objects. So it's free to assume that you're not changing the value of up->i after you set it to 123 so it can simply return 123 without looking at the actual variable contents again.

If instead, you changed the pointer setting statement to:

up->f = 2.0;

then that would make footnote 82 applicable and the returned value would be a re-interpretation of the float as an integer.
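
Here is a minimal sketch of that modified function, assuming `int` and `float` have the same size and `float` is IEEE 754 single precision. The names `someFuncViaMember`, `float_bits` and `punned_two` are hypothetical helpers added for illustration; `float_bits` uses `memcpy` only as an independent reference for the expected bit pattern.

```c
#include <assert.h>
#include <string.h>

union u
{
    int i;
    float f;
};

/* Variant of someFunc that stores through the union member itself,
   so the compiler can see that the store touches the same union. */
int someFuncViaMember(union u *up)
{
    up->i = 123;
    up->f = 2.0f;   /* store via the member, not via a separate float* */
    return up->i;   /* bytes of f reinterpreted as an int (footnote 82) */
}

/* Reference value: copy the float's bytes into an int with memcpy. */
int float_bits(float value)
{
    int bits;
    assert(sizeof bits == sizeof value);  /* assumption: equal sizes */
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Convenience wrapper so callers do not need their own union object. */
int punned_two(void)
{
    union u obj;
    return someFuncViaMember(&obj);
}
```

Under those assumptions `punned_two()` should give 1073741824 (0x40000000, the bit pattern of 2.0f), which is the value GCC printed in the sample above.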

The reason I don't think that's an issue for the question is that you're writing and then reading the same type, so the aliasing rules don't come into play.


It's interesting to note that the unspecified behaviour is caused not by the function itself, but by calling it thus:

union u up;
int x = someFunc (&up, &(up.f)); // <- aliasing here

If you were instead to call it so:

union u up;
float down;
int x = someFunc (&up, &down); // <- no aliasing

that would not be a problem.
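
For completeness, a small sketch of the non-aliasing call wrapped in a function. `call_without_aliasing` is a name invented here; `someFunc` is the function from the sample above.

```c
#include <assert.h>

union u
{
    int i;
    float f;
};

int someFunc(union u *up, float *fp)
{
    up->i = 123;
    *fp = 2.0f;     /* writes to whatever fp points at */
    return up->i;
}

/* up and down are distinct objects, so the store through fp
   cannot touch up->i and the function must return 123. */
int call_without_aliasing(void)
{
    union u up;
    float down;
    int x = someFunc(&up, &down);
    assert(down == 2.0f);   /* the float store landed in down */
    return x;
}
```

Because the two arguments really are distinct objects here, the compiler's no-alias assumption and the program's actual behaviour agree, and the result is 123 at any optimisation level.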
