Strict aliasing rules in ISO C: does anyone understand them?


Problem description




I am trying to understand the strict aliasing rules in the C Standard.
As gcc applies these rules by default, I just want to be sure that I
fully understand the issue.

For questions (1), (2) and (3), I think that the answers are all "yes",
but I would be glad to have strong confirmation.

About questions (4), (5) and (6), I really don't know. Please help!

--------

The Standard says (http://www.open-std.org/jtc1/sc22/wg...docs/n1124.pdf, chapter 6.5):

An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type
  of the object,
- a type that is the signed or unsigned type corresponding to the
  effective type of the object,
- a type that is the signed or unsigned type corresponding to a
  qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned
  types among its members (including, recursively, a member of a
  subaggregate or contained union), or
- a character type.
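
To make the list above concrete, here is a minimal sketch (not part of the original post) of two ways of looking at an object's bytes that the quoted rule permits: copying with memcpy, and reading through a character type. The helper names float_bits and first_byte are made up for illustration, and the example assumes that float is 32 bits wide with an IEEE 754 representation.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: returns the object representation of f as an integer.
   Reading *(uint32_t *)&f instead would access a float object through an
   lvalue of an incompatible type, which the quoted rule does not allow. */
uint32_t float_bits(float f)
{
    uint32_t bits;

    assert(sizeof bits == sizeof f);   /* assumption: 32-bit float */
    memcpy(&bits, &f, sizeof bits);    /* byte-wise copy of the representation */
    return bits;
}

/* Hypothetical helper: reads the first byte of any object.
   Permitted because unsigned char is a character type. */
unsigned char first_byte(const void *obj)
{
    return *(const unsigned char *)obj;
}
```

On an IEEE 754 platform, float_bits(1.0f) would be expected to yield 0x3F800000; on other platforms the value differs, but the access itself stays within the rule.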
***** Question (1) *****

Let's have two structs with different tag names, like:

struct s1 {int i;};
struct s2 {int i;};

struct s1 *p1;
struct s2 *p2;

The compiler is free to assume that p1 and p2 point to different memory
locations and don't alias.
Two structs with different tag names are considered to be different types.

In the standard, we read the wording "effective type of the object"
many times.

This "effective type of the object" may be "int", "double", etc., but
may also be a "struct" type, right ???

And I suppose it may also be an "array" type or a "union" type as
well, is that correct ???
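
The assumption described in this question can be sketched as follows. This is only an illustration: store_both and demo_distinct are hypothetical names, and whether a particular compiler actually performs the optimization depends on its version and flags.

```c
struct s1 { int i; };
struct s2 { int i; };

/* Because *p1 and *p2 have incompatible types, a compiler may assume that
   the store through p2 cannot change p1->i, and may return the constant 1
   without reloading p1->i from memory. */
int store_both(struct s1 *p1, struct s2 *p2)
{
    p1->i = 1;
    p2->i = 2;
    return p1->i;
}

/* Well-defined use: the two pointers refer to distinct objects. */
int demo_distinct(void)
{
    struct s1 a = { 0 };
    struct s2 b = { 0 };

    return store_both(&a, &b);   /* a.i is 1, b.i is 2 */
}
```

Calling store_both with both pointers aimed at the same storage (via a cast) is exactly the case where the result is no longer predictable.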
***** Question (2) *****

In the little program that follows, the line "printf("%d\n", *x);"
normally returns 123,
but an optimizing compiler can return garbage instead of 123.
Is my reasoning correct ???

On the other hand, the line "printf("%d\n", p1->i);" always prints 999
as expected, right ???

----

#include <stdio.h>
#include <stdlib.h>

struct s1 { int i; double f; };

int main(void)
{
    struct s1* p1;
    int* x;

    p1 = malloc(sizeof(*p1));
    p1->i = 123; // object of type 'struct s1' contains 123

    x = &(p1->i);

    // I try to access a value stored in an object of type 'struct s1'
    // through *x which is of type 'int'.
    // I think this is not allowed by the standard !
    printf("%d\n", *x);

    *x = 999; // I store 999 in *x, which is of type 'int'

    // I access a value stored in *x which is of type 'int'
    // by *p1 ( as p1->i is a shortcut for (*p1).i )
    // which is of type 'struct s1',
    // but contains a member of type 'int'.
    // I think this is allowed by the standard.
    printf("%d\n", p1->i);
    return 0;
}
***** Question (3) *****

The Standard forbids ( if I am not mistaken ) a pointer of type "struct A
*" to access data written by a pointer of type "struct B *", as they are
different types.

This means that the common usage of faking inheritance in C like in
this code snippet is now utterly wrong, is it correct ???
--- myfile.c ---

#include <stdio.h>
#include <stdlib.h>

typedef enum { RED, BLUE, GREEN } Color;

struct Point {
    int x;
    int y;
};

struct Color_Point {
    int x;
    int y;
    Color color;
};

struct Color_Point2 {
    struct Point point;
    Color color;
};

int main(int argc, char* argv[])
{
    struct Point* p;

    struct Color_Point* my_color_point = malloc(sizeof(struct Color_Point));
    my_color_point->x = 10;
    my_color_point->y = 20;
    my_color_point->color = GREEN;

    p = (struct Point*)my_color_point;

    // trying to access data stored in a "struct Color_Point" object
    // using a "struct Point*" pointer is forbidden by the Standard ???
    printf("x:%d, y:%d\n", p->x, p->y);

    struct Color_Point2* my_color_point2 = malloc(sizeof(struct Color_Point2));
    my_color_point2->point.x = 100;
    my_color_point2->point.y = 200;
    my_color_point2->color = RED;

    p = (struct Point*)my_color_point2;

    // trying to access data stored in a "struct Color_Point2" object
    // using a "struct Point*" pointer is forbidden by the Standard ???
    printf("x:%d, y:%d\n", p->x, p->y);

    p = &my_color_point2->point;

    printf("x:%d, y:%d\n", p->x, p->y); // but this is correct, right ???
    return 0;
}
Is the line "p = (struct Point*)my_color_point" also a case of what is
called "type-punning" ???
***** Question (4) *****

In the Standard, chapter 6.5.2.3, it is written:

One special guarantee is made in order to simplify the use of unions:
if a union contains several structures that share a common initial
sequence (see below), and if the union object currently contains one of
these structures, it is permitted to inspect the common initial part of
any of them anywhere that a declaration of the complete type of the
union is visible. Two structures share a common initial sequence if
corresponding members have compatible types (and, for bit-fields, the
same widths) for a sequence of one or more initial members.

I find this statement completely obscure.

Let's have:

struct s1 {int i;};
struct s2 {int i;};

struct s1 *p1;
struct s2 *p2;

A compiler is free to assume that *p1 and *p2 don't alias.

If we just put a union declaration like this before this code, then it
acts like a flag to the compiler, indicating that pointers to "struct
s1" and pointers to "struct s2" ( here, p1 and p2 ) may alias and point
to the same location.

union p1_p2_alias_flag {
    struct s1 st1;
    struct s2 st2;
};

There is no need to use "union p1_p2_alias_flag" for accessing data,
and "p1_p2_alias_flag", "st1" and "st2" are just dummy names, not used
anywhere else.
I mean, it is possible to access data directly through p1 and p2.

Do you agree, everybody ???
***** Question (5) *****

This question is really hard.

Let's have this code snippet:

---------
#include <stdio.h>

int main (void)
{
    struct s1 { int i; };

    struct s1 s = {77};

    unsigned char* x = (unsigned char*)&s;
    // Standard says data stored in "struct s1" type can be read by pointer to "char"
    printf("%d %d %d %d\n", (int)x[0], (int)x[1], (int)x[2], (int)x[3]);

    x[0] = 100; // here, I write data in "char" objects !!!
    x[1] = 101;
    x[2] = 102;
    x[3] = 103;

    // but data stored in "char" objects cannot be read by pointer to "struct s1" ???
    printf("%d\n", s.i);

    return 0;
}
-----------

For the line "printf("%d %d %d %d\n", (int)x[0], (int)x[1], (int)x[2],
(int)x[3]);", I can rewrite the Standard clause like this:

An object [ here, s of type "struct s1" ] shall have its stored value
accessed only by an lvalue expression that has one of
the following types:
[ blah blah blah ]
- a character type [ in our example, x[0], x[1], x[2], x[3] ]. //
it is our case, so everything is OK so far !
But what about the line "printf("%d\n", s.i);" ??????
I read the Standard again and again, but I cannot express how it can
work.
If I rewrite the Standard clause, it gives:

An object [ in our example, x[0], x[1], x[2], and x[3] ] shall have its
stored value accessed only by an lvalue expression that has one of
the following types:
- a type compatible with the effective type of the object,
  [ this is not our case ]
- a qualified version of a type compatible with the effective type
  of the object, [ still not our case ]
- a type that is the signed or unsigned type corresponding to the
  effective type of the object, [ still not our case ]
- a type that is the signed or unsigned type corresponding to a
  qualified version of the effective type of the object,
  [ still not our case ]
- an aggregate or union type that includes one of the aforementioned
  types among its members (including, recursively, a member of a
  subaggregate or contained union), [ we read through "s" which is of
  type "struct s1", but it does not contain a member of type "char" ] or
- a character type. [ definitely not our case ]

We see that none of these conditions applies in our case.

Where is the flaw in my reasoning ???
Does the last "printf" line of this code snippet work or not ??? and
why ???
***** Question (6) *****

I often see this code used with socket programming:

struct sockaddr_in my_addr;
...
bind(sockfd, (struct sockaddr *)&my_addr, sizeof(struct sockaddr));

The function bind(...) needs a pointer to "struct sockaddr", but
my_addr is a "struct sockaddr_in".
So, in my opinion, the function bind is not guaranteed to safely access
the contents of the object my_addr.

Does anyone know why this code is not broken ( or whether it is ) ???

Recommended answer

In article <11**********************@f14g2000cwb.googlegroups .com>,
ni************@genevoise.ch wrote:
***** Question (2) *****

In the little program that follows, the line "printf("%d\n", *x);"
normally returns 123,
but an optimizing compiler can return garbage instead of 123.
Is my reasoning correct ???

On the other side, the line "printf("%d\n", p1->i);" always returns 999
as expected, right ???

----

#include <stdio.h>
#include <stdlib.h>

struct s1 { int i; double f; };

int main(void)
{
    struct s1* p1;
    int* x;

    p1 = malloc(sizeof(*p1));
    p1->i = 123; // object of type 'struct s1' contains 123

    x = &(p1->i);

    // I try to access a value stored in an object of type 'struct s1'
    // through *x which is of type 'int'.
    // I think this is not allowed by the standard !
    printf("%d\n", *x);

    *x = 999; // I store 999 in *x, which is of type 'int'

    // I access a value stored in *x which is of type 'int'
    // by *p1 ( as p1->i is a shortcut for (*p1).i )
    // which is of type 'struct s1',
    // but contains a member of type 'int'.
    // I think this is allowed by the standard.
    printf("%d\n", p1->i);
    return 0;
}

This is all ok. The only unusual thing with structs is that there can be
padding, and that storing into any struct member could modify any
padding in the struct. If there is padding between int i and double f,
then p1->i = 123 could modify the padding, while *x = 999 couldn't.
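
As a minimal sketch of why this case is fine: the pointer x has type int * and points at an object whose effective type is int, so both accesses use a compatible type. The function name member_pointer_roundtrip is hypothetical, and the allocation check and free are additions that were not in the original program.

```c
#include <stdlib.h>

struct s1 { int i; double f; };

/* Returns 1 if both reads see the expected values, 0 if not,
   and -1 if the allocation fails. */
int member_pointer_roundtrip(void)
{
    struct s1 *p1 = malloc(sizeof *p1);
    int *x;
    int first, second;

    if (p1 == NULL)
        return -1;

    p1->i = 123;
    x = &p1->i;        /* pointer to the int member, type int * */
    first = *x;        /* reads an int object through an int lvalue */

    *x = 999;
    second = p1->i;    /* same int object, reached through the struct */

    free(p1);
    return first == 123 && second == 999;
}
```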

***** Question (3) *****

The Standard forbids ( if I am not mistaken ) a pointer of type "struct A
*" to access data written by a pointer of type "struct B *", as they are
different types.

This means that the common usage of faking inheritance in C like in
this code snippet is now utterly wrong, is it correct ???
--- myfile.c ---

#include <stdio.h>
#include <stdlib.h>

typedef enum { RED, BLUE, GREEN } Color;

struct Point {
    int x;
    int y;
};

struct Color_Point {
    int x;
    int y;
    Color color;
};

struct Color_Point2 {
    struct Point point;
    Color color;
};

int main(int argc, char* argv[])
{
    struct Point* p;

    struct Color_Point* my_color_point = malloc(sizeof(struct Color_Point));
    my_color_point->x = 10;
    my_color_point->y = 20;
    my_color_point->color = GREEN;

    p = (struct Point*)my_color_point;

This is undefined behavior. There is no guarantee that my_color_point is
correctly aligned for a pointer of type (struct Point *).

    // trying to access data stored in a "struct Color_Point" object
    // using a "struct Point*" pointer is forbidden by the Standard ???
    printf("x:%d, y:%d\n", p->x, p->y);

Yes. There is an exception: if the compiler has seen a declaration of a
union with members of type "struct Point" and "struct Color_Point", then
accessing the common initial members of both structs is legal, even
writing to a member of one struct and reading it as a member of the
other struct.

    struct Color_Point2* my_color_point2 = malloc(sizeof(struct Color_Point2));
    my_color_point2->point.x = 100;
    my_color_point2->point.y = 200;
    my_color_point2->color = RED;

    p = (struct Point*)my_color_point2;

Yes, you can always cast a pointer to a struct to a pointer to its first
member.

    // trying to access data stored in a "struct Color_Point2" object
    // using a "struct Point*" pointer is forbidden by the Standard ???
    printf("x:%d, y:%d\n", p->x, p->y);

That's fine.

    p = &my_color_point2->point;

    printf("x:%d, y:%d\n", p->x, p->y); // but this is correct, right ???
    return 0;
}
Is the line "p = (struct Point*)my_color_point" also a case of what is
called "type-punning" ???
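
For the Color_Point2 case discussed in this answer, the cast to the first member's type can be sketched as below. The functions sum_xy and demo_first_member are hypothetical names introduced only for this illustration; the struct definitions repeat the ones from the question.

```c
#include <assert.h>

typedef enum { RED, BLUE, GREEN } Color;

struct Point { int x; int y; };
struct Color_Point2 { struct Point point; Color color; };

/* Hypothetical "base class" function that only knows about struct Point. */
int sum_xy(const struct Point *p)
{
    return p->x + p->y;
}

int demo_first_member(void)
{
    struct Color_Point2 cp = { { 100, 200 }, RED };
    struct Point *p = (struct Point *)&cp;   /* points at cp.point */

    assert(p == &cp.point);   /* same address as the first member */
    return sum_xy(p);
}
```

The design point is that struct Point is embedded as the first member instead of having its fields repeated, so the object reached through p really is a struct Point.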
***** Question (4) *****

In the Standard, chapter 6.5.2.3, it is written:

One special guarantee is made in order to simplify the use of unions:
if a union contains several structures that share a common initial
sequence (see below), and if the union object currently contains one of
these structures, it is permitted to inspect the common initial part of
any of them anywhere that a declaration of the complete type of the
union is visible. Two structures share a common initial sequence if
corresponding members have compatible types (and, for bit-fields, the
same widths) for a sequence of one or more initial members.

I find this statement completely obscure.

Let's have:

struct s1 {int i;};
struct s2 {int i;};

struct s1 *p1;
struct s2 *p2;

A compiler is free to assume that *p1 and *p2 don't alias.

Exactly.

If we just put a union declaration like this before this code, then it
acts like a flag to the compiler, indicating that pointers to "struct
s1" and pointers to "struct s2" ( here, p1 and p2 ) may alias and point
to the same location.

union p1_p2_alias_flag {
    struct s1 st1;
    struct s2 st2;
};

There is no need to use "union p1_p2_alias_flag" for accessing data,
and "p1_p2_alias_flag", "st1" and "st2" are just dummy names, not used
anywhere else.
I mean, it is possible to access data using directly p1 and p2.

Yes, that is right.
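
Whether the mere visibility of the union declaration is enough has been read in different ways. The uncontroversial form of the guarantee keeps every access inside the union object, which is also what GCC documents for -fstrict-aliasing. A minimal sketch of that form follows; the names s1_s2 and read_common_initial_member are made up for the example.

```c
struct s1 { int i; };
struct s2 { int i; };

union s1_s2 {
    struct s1 st1;
    struct s2 st2;
};

/* Stores through one member and inspects the common initial sequence
   through the other, with every access going through the union object. */
int read_common_initial_member(int value)
{
    union s1_s2 u;

    u.st1.i = value;
    return u.st2.i;
}
```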

***** Question (5) *****

This question is really hard.

Let's have this code snippet:

---------
#include <stdio.h>

int main (void)
{
    struct s1 { int i; };

    struct s1 s = {77};

    unsigned char* x = (unsigned char*)&s;
    // Standard says data stored in "struct s1" type can be read by pointer to "char"
    printf("%d %d %d %d\n", (int)x[0], (int)x[1], (int)x[2], (int)x[3]);

That is if sizeof (int) >= 4, which is nowhere guaranteed.

    x[0] = 100; // here, I write data in "char" objects !!!
    x[1] = 101;
    x[2] = 102;
    x[3] = 103;

    // but data stored in "char" objects cannot be read by pointer to "struct s1" ???
    printf("%d\n", s.i);

Assuming that sizeof (int) == 4, you have changed exactly every bit in
the representation of s. If the representation is not a trap
representation, you are fine. And it is even ok if, for example, the
result after storing three bytes, combined with the last remaining byte
of the number 77, were a trap representation, because you never access
that value.

    return 0;
}

For the line "printf("%d %d %d %d\n", (int)x[0], (int)x[1], (int)x[2],
(int)x[3]);", I can rewrite the Standard clause like this:

An object [ here, s of type "struct s1" ] shall have its stored value
accessed only by an lvalue expression that has one of
the following types:
[ blah blah blah ]
- a character type [ in our example, x[0], x[1], x[2], x[3] ]. //
it is our case, so everything is OK so far !
But what about the line "printf("%d\n", s.i);" ??????
I read the Standard again and again, but I cannot express how it can
work.

If the bytes stored are a valid representation of an int, then that is
what it prints. If not, it is undefined behavior. A specific compiler
might guarantee that ints have no trap representations.
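
The point of this answer can be sketched as follows, assuming an implementation where int has no trap representations (the usual situation on mainstream desktop platforms, but not something the C Standard promises). Both function names are hypothetical. The loop bound uses sizeof so that nothing depends on int being four bytes.

```c
#include <string.h>

struct s1 { int i; };

/* Overwrites every byte of s.i through an unsigned char pointer, then reads
   the int back. Assumption: the resulting bit pattern is a valid int value. */
int overwrite_bytes_then_read(void)
{
    struct s1 s = { 77 };
    unsigned char *x = (unsigned char *)&s;

    for (size_t k = 0; k < sizeof s.i; k++)   /* sizeof, not a hard-coded 4 */
        x[k] = (unsigned char)(100 + k);

    return s.i;
}

/* Builds the same byte pattern in a separate buffer and copies it into an int. */
int same_bytes_via_memcpy(void)
{
    unsigned char bytes[sizeof(int)];
    int value;

    for (size_t k = 0; k < sizeof bytes; k++)
        bytes[k] = (unsigned char)(100 + k);

    memcpy(&value, bytes, sizeof value);
    return value;
}
```

Under the stated assumption the two functions return the same value; which value that is depends on the size and byte order of int.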
***** Question (6) *****

I often see this code used with socket programming:

struct sockaddr_in my_addr;
...
bind(sockfd, (struct sockaddr *)&my_addr, sizeof(struct sockaddr));

The function bind(...) needs a pointer to "struct sockaddr", but
my_addr is a "struct sockaddr_in".
So, in my opinion, the function bind is not guaranteed to access safely
the content of object my_addr.

Someone knows why this code is not broken ( or if it is ) ???







Depends on the declarations of the types involved. And remember that the
C Standard is not the only standard. For example, the C Standard doesn't
guarantee that 'a' + 1 == 'b', but if your C implementation uses ASCII
or Unicode for its character set, then the ASCII standard or the Unicode
standard would give you that guarantee.

In your case, it could be that POSIX guarantees that the code is
correct. So it will work on any implementation that conforms to the
POSIX standard (no matter whether it conforms to the C Standard or not),
even though it might not work on an implementation that conforms to the
C Standard but not to POSIX.
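
A minimal sketch of the pattern in question, for a POSIX system. The functions port_of and demo_port are hypothetical stand-ins (port_of plays the role of a callee such as bind), and reading sa_family through the generic pointer relies on the POSIX socket API conventions mentioned above, not on anything the C Standard itself promises.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical callee in the style of bind(): it only sees a generic
   struct sockaddr pointer plus a length. Returns the port in host byte
   order, or -1 if the address is not a complete AF_INET address. */
int port_of(const struct sockaddr *sa, socklen_t len)
{
    struct sockaddr_in in;

    if (len < (socklen_t)sizeof in || sa->sa_family != AF_INET)
        return -1;
    memcpy(&in, sa, sizeof in);   /* copy out instead of casting back */
    return ntohs(in.sin_port);
}

/* Caller side, mirroring the cast from the question. */
int demo_port(void)
{
    struct sockaddr_in my_addr;

    memset(&my_addr, 0, sizeof my_addr);
    my_addr.sin_family = AF_INET;
    my_addr.sin_port = htons(8080);
    my_addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    return port_of((struct sockaddr *)&my_addr, sizeof my_addr);
}
```

The sketch passes sizeof my_addr as the length, so the callee is told the size of the object it actually receives.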


On 13 Oct 2005 07:39:48 -0700, ni************@genevoise.ch wrote in
comp.lang.c:

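I try to understand strict aliasing rules that are in the C Standard.
As gcc applies these rules by default, I just want to be sure to
understand fully this issue.

For questions (1), (2) and (3), I think that the answers are all "yes",
but I would be glad to have strong confirmation.

About questions (4), (5) and (6), I really don't know. Please help!

The Standard says (http://www.open-std.org/jtc1/sc22/wg...docs/n1124.pdf, chapter 6.5):

An object shall have its stored value accessed only by an lvalue
expression that has one of the following types:
- a type compatible with the effective type of the object,
- a qualified version of a type compatible with the effective type
  of the object,
- a type that is the signed or unsigned type corresponding to the
  effective type of the object,
- a type that is the signed or unsigned type corresponding to a
  qualified version of the effective type of the object,
- an aggregate or union type that includes one of the aforementioned
  types among its members (including, recursively, a member of a
  subaggregate or contained union), or
- a character type.

***** Question (1) *****

Let's have two struct having different tag names, like:

struct s1 {int i;};
struct s2 {int i;};

struct s1 *p1;
struct s2 *p2;

The compiler is free to assume that p1 and p2 point to different memory
locations and don't alias.
Two struct having different names are considered to be different types.

In the standard, we read the wording "effective type of the object"
many times.

This "effective type of the object" may be an "int", "double", etc, but
may also be a "struct" type, right ???

And I suppose it may also be an "array" type or a "union" type as
well, is it correct ???

Yes.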

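***** Question (2) *****

In the little program that follows, the line "printf("%d\n", *x);"
normally returns 123,
but an optimizing compiler can return garbage instead of 123.

No, an optimizing compiler must still output "123" for this line.

Is my reasoning correct ???

On the other side, the line "printf("%d\n", p1->i);" always returns 999
as expected, right ???

----

#include <stdio.h>
#include <stdlib.h>

struct s1 { int i; double f; };

int main(void)
{
    struct s1* p1;
    int* x;

    p1 = malloc(sizeof(*p1));
    p1->i = 123; // object of type 'struct s1' contains 123

    x = &(p1->i);

    // I try to access a value stored in an object of type 'struct s1'
    // through *x which is of type 'int'.
    // I think this is not allowed by the standard !
    printf("%d\n", *x);

The effective type of *p1 is 'struct s1'. The effective type of s1.i
is 'int'. 'x' is a pointer to int, and you initialized it with a
pointer to int. This is perfectly legal.

Since the int contains the value 123, and 'x' quite properly points to
that int, *x must retrieve the int value 123. It can't do anything
else.

    *x = 999; // I store 999 in *x, which is of type 'int'

    // I access a value stored in *x which is of type 'int'
    // by *p1 ( as p1->i is a shortcut for (*p1).i )
    // which is of type 'struct s1',
    // but contains a member of type 'int'.
    // I think this is allowed by the standard.
    printf("%d\n", p1->i);

    return 0;
}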

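***** Question (3) *****

The Standard forbids ( if I am not mistaken ) pointer of type "struct A
*" to access data written by a pointer of type "struct B *", as they are
different types.

This means that the common usage of faking inheritance in C like in
this code snippet is now utterly wrong, is it correct ???

--- myfile.c ---

#include <stdio.h>
#include <stdlib.h>

typedef enum { RED, BLUE, GREEN } Color;

struct Point {
    int x;
    int y;
};

struct Color_Point {
    int x;
    int y;
    Color color;
};

struct Color_Point2 {
    struct Point point;
    Color color;
};

int main(int argc, char* argv[])
{
    struct Point* p;

    struct Color_Point* my_color_point = malloc(sizeof(struct Color_Point));
    my_color_point->x = 10;
    my_color_point->y = 20;
    my_color_point->color = GREEN;

    p = (struct Point*)my_color_point;

    printf("x:%d, y:%d\n", p->x, p->y); // trying to access data stored in

This is undefined behavior, pure and simple. It works on a lot of
implementations, but it is not guaranteed at all.

[snip]

Is the line "p = (struct Point*)my_color_point" also a case of what is
called "type-punning" ???

Type punning is not a term defined by the standard, but I would say
that the act of assigning a pointer via a cast is not type punning.
Accessing a member of the foreign structure type through the pointer
is.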

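***** Question (4) *****

In the Standard, chapter 6.5.2.3, it is written:

One special guarantee is made in order to simplify the use of unions:
if a union contains several structures that share a common initial
sequence (see below), and if the union object currently contains one of
these structures, it is permitted to inspect the common initial part of
any of them anywhere that a declaration of the complete type of the
union is visible. Two structures share a common initial sequence if
corresponding members have compatible types (and, for bit-fields, the
same widths) for a sequence of one or more initial members.

I find this statement completely obscure.

Let's have:

struct s1 {int i;};
struct s2 {int i;};

struct s1 *p1;
struct s2 *p2;

A compiler is free to assume that *p1 and *p2 don't alias.

If we just put a union declaration like this before this code, then it
acts like a flag to the compiler, indicating that pointers to "struct
s1" and pointers to "struct s2" ( here, p1 and p2 ) may alias and point
to the same location.

union p1_p2_alias_flag {
    struct s1 st1;
    struct s2 st2;
};

There is no need to use "union p1_p2_alias_flag" for accessing data,
and "p1_p2_alias_flag", "st1" and "st2" are just dummy names, not used
anywhere else.
I mean, it is possible to access data using directly p1 and p2.

It seems unlikely that a compiler could find a way to prevent this from
working in general, even if the implementer tried, but such behavior
would not render the compiler non-conforming.

On the other hand, since your structure only contains a single member,
and the first member always begins at the same address as the
structure itself, this particular usage can't fail.

Still, the behavior is undefined. Which means the language standard
places no requirements on it at all.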
Do you agree, everybody ???

***** Question (5) *****

This question is really hard.

Let's have this code snippet:

---------
#include <stdio.h>

int main (void)
{
    struct s1 { int i; };

    struct s1 s = {77};

    unsigned char* x = (unsigned char*)&s;
    // Standard says data stored in "struct s1" type can be read by pointer to "char"
    printf("%d %d %d %d\n", (int)x[0], (int)x[1], (int)x[2], (int)x[3]);

    x[0] = 100; // here, I write data in "char" objects !!!
    x[1] = 101;
    x[2] = 102;
    x[3] = 103;

The standard does not say that you can do this. You are assuming that
sizeof(int) is at least 4, and there are implementations where that is
not true. Accessing, let alone writing to, x[1], x[2], or x[3] might
be outside the bounds of the int and the struct, producing undefined
behavior.

    // but data stored in "char" objects cannot be read by pointer to "struct s1" ???
    printf("%d\n", s.i);

    return 0;
}

No, the point is that accessing s.i, an int, after storing data into
that memory using a different object type, is undefined. You might
have created a bit pattern that does not represent a valid value for
the int, called a trap representation.

-----------

For the line "printf("%d %d %d %d\n", (int)x[0], (int)x[1], (int)x[2],
(int)x[3]);", I can rewrite the Standard clause like this:

An object [ here, s of type "struct s1" ] shall have its stored value
accessed only by an lvalue expression that has one of
the following types:
[ blah blah blah ]
- a character type [ in our example, x[0], x[1], x[2], x[3] ]. //
it is our case, so everything is OK so far !


I have worked on a platform where sizeof(int) is 1, and several where
sizeof(int) is 2. I have never worked on a platform where sizeof(int)
is 3, but C allows it. On any of these platforms you would be
invoking undefined behavior.
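
One way to avoid the hard-coded indices x[0] to x[3] is to let sizeof bound the loop, so the same code is valid whatever the size of int. This is only a sketch; int_bytes and int_bytes_roundtrip are hypothetical helpers, not something proposed in the thread.

```c
#include <string.h>

/* Hypothetical helper: copies the object representation of an int into out,
   using sizeof(int) rather than assuming four bytes. Returns the byte count. */
size_t int_bytes(int value, unsigned char out[sizeof(int)])
{
    const unsigned char *p = (const unsigned char *)&value;

    for (size_t k = 0; k < sizeof value; k++)
        out[k] = p[k];
    return sizeof value;
}

/* Returns 1 if the copied bytes reproduce the original value. */
int int_bytes_roundtrip(int value)
{
    unsigned char buf[sizeof(int)];
    int copy;

    int_bytes(value, buf);
    memcpy(&copy, buf, sizeof copy);
    return copy == value;
}
```

Because the bytes come from a valid int in the first place, copying them back cannot produce a trap representation.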

But what about the line "printf("%d\n", s.i);" ??????


Even assuming that sizeof(int) >= 4 on your implementation, you have
to understand that all types, other than unsigned char, can have trap
representations, that is, bit patterns that do not represent a valid
value for the type. By writing arbitrary bit patterns into an int,
you may have created an invalid bit pattern in that int. When you
access that invalid bit pattern as an int, the behavior is undefined.

I read the Standard again and again, but I cannot express how it can
work.
If I rewrite the Standard clause, it gives:

An object [ in our example, x[0], x[1], x[2], and x[3] ] shall have its
stored value accessed only by an lvalue expression that has one of
the following types:
- a type compatible with the effective type of the object,
  [ this is not our case ]
- a qualified version of a type compatible with the effective type
  of the object, [ still not our case ]
- a type that is the signed or unsigned type corresponding to the
  effective type of the object, [ still not our case ]
- a type that is the signed or unsigned type corresponding to a
  qualified version of the effective type of the object,
  [ still not our case ]
- an aggregate or union type that includes one of the aforementioned
  types among its members (including, recursively, a member of a
  subaggregate or contained union), [ we read through "s" which is of
  type "struct s1", but it does not contain a member of type "char" ] or
- a character type. [ definitely not our case ]

We see that none of these conditions applies in our case.


The standard provides a specific list of what is allowed. Lists like

this are always exhaustive. That means anything on the list is

specifically undefined.

Where is the flaw in my reasoning ???


There is no flaw in your reasoning; the code produces undefined
behavior.

Does the last "printf" line of this code snippet work or not ??? and
why ???


There is no question of "work". Whatever it does is just as right or
wrong as anything else that might happen as far as the language is
concerned. That's what undefined behavior means. The C standard does
not know or care what happens.

***** Question (6) *****

I often see this code used with socket programming:

struct sockaddr_in my_addr;
...
bind(sockfd, (struct sockaddr *)&my_addr, sizeof(struct sockaddr));

The function bind(...) needs a pointer to "struct sockaddr", but
my_addr is a "struct sockaddr_in".
So, in my opinion, the function bind is not guaranteed to access safely
the content of object my_addr.

Does anyone know why this code is not broken ( or whether it is ) ???


That depends on the definition of 'struct sockaddr_in'. If its first
member is a 'struct sockaddr', the code is legal and well defined
because a pointer to a structure can always be converted to a pointer
to its first member. If not, then the code produces undefined
behavior if the called function actually uses the pointer to access
members of a 'struct sockaddr'.
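
A minimal sketch of the first-member rule mentioned above. The types `struct base` and `struct derived` and both function names are hypothetical, chosen only for illustration; they are not the real socket structures, whose layout depends on the platform headers.

```c
#include <assert.h>

/* Hypothetical types for illustration; not the real socket structures. */
struct base    { int family; };
struct derived { struct base b; int port; };

/* Reads through a pointer to the first member's type. */
static int get_family(const struct base *p)
{
    return p->family;
}

static int family_of(const struct derived *d)
{
    /* A pointer to a structure, suitably converted, points to its
       first member (C99 6.7.2.1), so this conversion is well defined. */
    const struct base *p = (const struct base *)d;

    assert(p == &d->b);
    return get_family(p);
}
```

This is the same layout as `struct Color_Point2` in question (3), where the "base" structure is embedded as the first member instead of repeating its fields.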


You use terms like "broken" and "work", which do not really apply as
far as undefined behavior in C is concerned. They are subjective
terms at best. Code is "broken" if it does not do what you want; you
consider it to "work" if it does. If it produces undefined behavior,
it may "work" on one compiler but be "broken" on another, and both
compilers can be standard conforming.


--
Jack Klein
Home: http://JK-Technology.Com
FAQs for
comp.lang.c http://www.eskimo.com/~scs/C-faq/top.html
comp.lang.c++ http://www.parashift.com/c++-faq-lite/
alt.comp.lang.learn.c-c++
http://www.contrib.andrew.cmu.edu/~a...FAQ-acllc.html


Christian Bau <ch***********@cbau.freeserve.co.uk> wrote:
In article <11**********************@f14g2000cwb.googlegroups .com>,
ni************@genevoise.ch wrote:

[snip]
***** Question (5) *****

This question is really hard.

Let's have this code snippet:

---------
#include <stdio.h>

int main (void)
{

struct s1 {int i;
};

struct s1 s = {77};

unsigned char* x = (unsigned char*)&s;
printf("%d %d %d %d\n", (int)x[0], (int)x[1], (int)x[2], (int)x[3]);
// Standard says data stored in "struct s1" type can be read by pointer to "char"



That is if sizeof (int) >= 4, which is nowhere guaranteed.




x[0] = 100; // here, I write data in "char" objects !!!
x[1] = 101;
x[2] = 102;
x[3] = 103;

Let's suppose that we copy the value from another int:

int i = 42;
unsigned char *y = (void*)&i;
assert(sizeof(int) == 4);
x[0] = y[0];
//...etc.

printf("%d\n", s.i); // but data stored in "char" objects cannot be read by pointer to "struct s1" ???


Storing values through character lvalues did not change the effective
type of the struct, or its member, therefore it's okay (the compiler
must reread the value from memory).

The effective type of a declared object is always its declared type.
The effective type of an allocated object is the type last imprinted by
storing a value or by copying (memcpy, memmove, char array), or, if
none, the type of the lvalue it is accessed with.
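
A minimal sketch of both cases, under the assumption that the bytes written come from a valid int, so no trap representation can arise. The function names are hypothetical and not from the thread. The first function completes the byte-by-byte copy into a declared `struct s1`; the second uses memcpy into malloc'ed storage, which has no declared type.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

struct s1 { int i; };

/* Declared object: its effective type stays struct s1 (member: int).
   Overwriting all of its bytes through unsigned char lvalues with the
   bytes of another valid int leaves a valid int behind. */
static int copy_bytes_into_struct(int value)
{
    struct s1 s = { 77 };
    unsigned char *x = (unsigned char *)&s.i;
    const unsigned char *y = (const unsigned char *)&value;
    size_t n;

    for (n = 0; n < sizeof value; n++)
        x[n] = y[n];
    return s.i;
}

/* Allocated object: it has no declared type; memcpy from an int
   gives the storage the effective type int for the read below. */
static int copy_into_allocated(int value)
{
    int result;
    int *p = malloc(sizeof *p);

    if (p == NULL)
        return value;   /* allocation failed; nothing to demonstrate */
    memcpy(p, &value, sizeof value);
    result = *p;
    free(p);
    return result;
}
```

Both functions return the value they were given, for example 42 for `copy_bytes_into_struct(42)`.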

Assuming that sizeof (int) == 4, you have changed exactly every bit in
the representation of x. If the representation is not a trap
representation, you are fine. And it is even ok if for example the
result after storing three bytes, combined with the last remaining byte
of the number 77 were a trap representation, because you never access
that value.




(all agreed)


[snip]

For the line "printf("%d %d %d %d\n", (int)x[0], (int)x[1], (int)x[2],
(int)x[3]);", I can rewrite the Standard clause like this:

An object [ here, s of type "struct s1" ] shall have its stored value
accessed only by an lvalue expression that has one of
the following types:
[ blah blah blah ]
- a character type [ in our example, x[0], x[1], x[2], x[3] ]. //
it is our case, so everything is OK so far !

But what about the line "printf("%d\n", s.i);" ??????
I read the Standard again and again, but I cannot express how it can
work.







It means this: a struct s1 object can be legally accessed with a character
lvalue (including writing data to the struct). Since it's legal,
the compiler must take it into consideration when later accessing
struct s1. Either it can prove that the character lvalues did not refer
to the struct object, or it must re-read the struct value from memory.
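
A minimal sketch of that obligation, assuming an implementation where int has no padding bits (true of mainstream platforms, but not guaranteed by the standard). The function name is hypothetical. Because `pc` has a character type, the compiler has to allow for it pointing into `*pi`.

```c
#include <assert.h>

/* pc may point into *pi: character lvalues are allowed to access any
   object, so the compiler has to reload *pi after the store to pc[0]. */
static int set_first_byte(int *pi, unsigned char *pc)
{
    *pi = 0;
    pc[0] = 1;      /* may modify a byte of *pi */
    return *pi;     /* must reflect the store above if they overlap */
}
```

With `int v;`, the call `set_first_byte(&v, (unsigned char *)&v)` returns a nonzero value whose exact magnitude depends on byte order, while passing the address of an unrelated `unsigned char` returns 0.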


This is not the case with other types:

assert(sizeof(int) == sizeof(short));
int i = 42;
short *ps = (short *)&i; //assume that alignment is the same
*ps = 54; //this access is UB; since it is not legal to access int object
          //with short lvalue, compiler need not assume that object `i'
          //was actually changed
printf("%d\n", i); //may print cached value 42
                   //(the Std says it can do or not do virtually anything)


For another example: when a value is stored through a `short' lvalue,
the compiler need not assume that a `struct s1' object was changed,
because `struct s1' does not contain a `short' member.
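
A minimal sketch of what this permits at a function boundary. The function name is hypothetical. Since a short lvalue may not be used to access an int object, an optimizing compiler is allowed to treat the two pointers as referring to different objects and keep the stored value of `*pi` in a register.

```c
/* The store through ps cannot legally modify *pi, so a compiler is
   free to compile the final read as the constant 1. */
static int store_int_then_short(int *pi, short *ps)
{
    *pi = 1;
    *ps = 2;        /* assumed not to touch *pi */
    return *pi;     /* may be compiled as "return 1" */
}
```

Called with two distinct objects, as in `store_int_then_short(&i, &s)`, the behavior is fully defined and the result is 1. Passing `(short *)&i` as the second argument would be the undefined case described above.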


--
Stan Tobias
mailx `echo si***@FamOuS.BedBuG.pAlS.INVALID | sed s/[[:upper:]]//g`




