Is it legal to access struct members via offset pointers from other struct members?

Question

In these two examples, does accessing members of the struct by offsetting pointers from other members result in Undefined / Unspecified / Implementation Defined Behavior?

struct {
  int a;
  int b;
} foo1 = {0, 0};

(&foo1.a)[1] = 1;
printf("%d", foo1.b);


struct {
  int arr[1];
  int b;
} foo2 = {{0}, 0};

foo2.arr[1] = 1;
printf("%d", foo2.b);

Paragraph 14 of C11 § 6.7.2.1 seems to indicate that this should be implementation-defined:

Each non-bit-field member of a structure or union object is aligned in an implementation-defined manner appropriate to its type.

and later goes on to say:

There may be unnamed padding within a structure object, but not at its beginning.

However, code like the following appears to be fairly common:

union {
  int arr[2];
  struct {
    int a;
    int b;
  };
} foo3 = {{0, 0}};

foo3.arr[1] = 1;
printf("%d", foo3.b);

(&foo3.a)[1] = 2; // appears to be illegal despite foo3.arr == &foo3.a
printf("%d", foo3.b);

The standard appears to guarantee that foo3.arr is the same as &foo3.a, and it doesn't make sense that referring to it one way is legal and the other not, but equally it doesn't make sense that adding the outer union with the array should suddenly make (&foo3.a)[1] legal.

My reasoning for thinking that the first examples must therefore also be legal:

  1. foo3.arr is guaranteed to be the same as &foo3.a
  2. foo3.arr + 1 and &foo3.b point to the same memory location
  3. &foo3.a + 1 and &foo3.b must therefore point to the same memory location (from 1 and 2)
  4. struct layouts are required to be consistent, so &foo1.a and &foo1.b should be laid out exactly the same as &foo3.a and &foo3.b
  5. &foo1.a + 1 and &foo1.b must therefore point to the same memory location (from 3 and 4)

I've come across some outside sources suggesting that both the foo3.arr[1] and (&foo3.a)[1] examples are illegal, but I haven't been able to find a concrete statement in the standard that would make them so. Even if they were both illegal, it is also possible to construct the same scenario with flexible array members, which, as far as I can tell, does have standard-defined behavior.

union {
  struct {
    int x;
    int arr[];
  };
  struct {
    int y;
    int a;
    int b;
  };
} foo4;

The original application is considering whether or not a buffer overflow from one struct field into another is strictly speaking defined by the standard:

struct {
  char buffer[8];
  char overflow[8];
} buf;
strcpy(buf.buffer, "Hello world!");
printf("%s\n", buf.overflow);

I would expect this to output "rld!" on nearly any real-world compiler, but is this behavior guaranteed by the standard, or is it an undefined or implementation-defined behavior?

Answer

Introduction: The standard is inadequate in this area. There are decades of argument about this topic and about strict aliasing, with no convincing resolution and no proposal to fix it.

This answer reflects my view rather than any imposition of the Standard.

Firstly: it's generally agreed that the code in your first code sample is undefined behaviour due to accessing outside the bounds of an array via direct pointer arithmetic.

The rule is C11 6.5.6/8. It says that indexing from a pointer must remain within "the array object" (or one past the end). For this purpose, 6.5.6/7 treats a pointer to a non-array object as a pointer to the first element of an array of length one. The rule doesn't say which array object, but it is generally agreed that in the case int *p = &foo.a; "the array object" is foo.a, and not any larger object of which foo.a is a subobject.

Secondly: it's generally agreed that both of your union examples are correct. The standard explicitly says that any member of a union may be read, and whatever the contents of the relevant memory location are, they are interpreted as the type of the union member being read.

You suggest that the union being correct implies that the first code should be correct too, but it does not. The issue is not with specifying the memory location read; the issue is with how we arrived at the expression specifying that memory location.

Even though we know that &foo.a + 1 and &foo.b are the same memory address, it's valid to access an int through the second and not valid to access an int through the first.

It's generally agreed that you can access the int by computing its address in other ways that don't break the 6.5.6/8 rule, e.g.:

((int *)((char *)&foo + offsetof(foo, b)))[0]

((int *)((uintptr_t)&foo.a + sizeof(int)))[0]

It's not generally agreed whether ((int *)&foo)[1] is valid. Some say it's basically the same as your first code, since the standard says "a pointer to a structure object, suitably converted, points to its initial member" (C11 6.7.2.1/15). Others say it's basically the same as my (char *) example above because it follows from the specification of pointer casting. A few even claim it's a strict aliasing violation because it aliases a struct as an array.

Maybe relevant is N2090 - Pointer provenance proposal. This does not directly address the issue, and doesn't propose a repeal of 6.5.6/8.
