UB on reading object using non-character type when last written using character type

Question

Assuming unsigned int has no trap representations, do either or both of the statements marked (A) and (B) below provoke undefined behavior, why or why not, and (especially if you think one of them is well-defined but the other isn't), do you consider that a defect in the standard? I am primarily interested in the current version of the C standard (i.e. C2011), but if this is different in older versions of the standard, or in C++, I would also like to know about that.

(_Alignas is used in this program to eliminate any question of UB due to inadequate alignment. The rules I discuss in my interpretation, though, say nothing about alignment.)

#include <stdlib.h>
#include <string.h>

int main(void)
{
    unsigned int v1, v2;
    unsigned char _Alignas(unsigned int) b1[sizeof(unsigned int)];
    unsigned char *b2 = malloc(sizeof(unsigned int));

    if (!b2) return 1;

    memset(b1, 0x55, sizeof(unsigned int));
    memset(b2, 0x55, sizeof(unsigned int));

    v1 = *(unsigned int *)b1; /* (A) */
    v2 = *(unsigned int *)b2; /* (B) */

    return !(v1 == v2);
}

My interpretation of C2011 is that (A) provokes undefined behavior but (B) is well-defined (to store an unspecified value into v2), because:


  • memset is defined (§7.24.6.1) to write to its first argument as-if through an lvalue with character type, which is allowed for both b1 and b2 per the special case at the bottom of §6.5p7.

  • The object b1 has a declared type, unsigned char[n]. Therefore, its effective type for accesses is also unsigned char[n] per 6.5p6. Statement (A) reads b1 via an lvalue expression whose type is unsigned int, which is neither the effective type of b1 nor covered by any of the other exceptions in 6.5p7, so the behavior is undefined.

  • The object pointed to by b2 has no declared type. The value stored into it (by memset) was (as-if) through an lvalue with character type, so the second case of 6.5p6 does not apply. The value was not copied from anywhere, so the third case of 6.5p6 does not apply either. Therefore, the effective type of the object is the type of the lvalue used for the access, which is unsigned int, and the rules of 6.5p7 are satisfied.
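For comparison, here is a minimal sketch of the third case of 6.5p6, the one that does not apply to (B). When a value is copied with memcpy from an object that has an effective type into allocated storage, the storage takes on that effective type for the following reads. The function name roundtrip_through_heap is hypothetical and not part of the question.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper (not from the question). Copies `value` into freshly
 * allocated storage with memcpy, then reads it back through an
 * unsigned int lvalue. Per the third case of 6.5p6, the allocated object
 * takes the effective type of the source (unsigned int) for this read. */
static int roundtrip_through_heap(unsigned int value, unsigned int *out)
{
    assert(out != NULL);

    void *p = malloc(sizeof value);
    if (!p)
        return -1;                 /* allocation failed */

    memcpy(p, &value, sizeof value);
    *out = *(unsigned int *)p;     /* read matches the effective type */

    free(p);
    return 0;
}
```

A caller would pass a value and a pointer to receive the result, and should check the return code before using the result.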

Finally, per 6.2.6.1, assuming unsigned int has no trap representations, the memset operation has created the representation of some unspecified unsigned int value in each of b1 and b2. Therefore, if neither (A) nor (B) provokes undefined behavior, then the actual values in v1 and v2 are unspecified but they are equal.
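Whatever the verdict on (A) and (B), the same bytes can be read without relying on the effective-type rules for the buffer at all, by copying them into a genuine unsigned int object. The following is only a sketch under the question's assumption that unsigned int has no trap representations. The names load_uint and same_after_memset are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical helper (not from the question). Copies sizeof(unsigned int)
 * bytes from buf into a real unsigned int object. memcpy accesses the
 * source as-if through character-type lvalues, which 6.5p7 permits for
 * an object of any declared or effective type. */
static unsigned int load_uint(const unsigned char *buf)
{
    unsigned int v;
    assert(buf != NULL);
    memcpy(&v, buf, sizeof v);
    return v;
}

/* Mirrors the question's setup with two buffers filled with 0x55.
 * Returns 1 if both buffers yield the same unsigned int value. */
static int same_after_memset(void)
{
    unsigned char b1[sizeof(unsigned int)];
    unsigned char b2[sizeof(unsigned int)];

    memset(b1, 0x55, sizeof b1);
    memset(b2, 0x55, sizeof b2);

    return load_uint(b1) == load_uint(b2);
}
```

Because the copy goes through memcpy, the source buffer needs no particular alignment and its declared type does not matter.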

Commentary:

The asymmetry of the "type-based aliasing" rules (that is, 6.5p7), permitting an object with any effective type to be accessed by an lvalue with character type, but not vice versa, is a continual source of confusion. The second case of 6.5p6 seems to have been added specifically to prevent its being undefined behavior to read a value initialized by memset (or, for that matter, calloc) but, because it only applies to objects with no declared type, is itself an additional source of confusion.
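The permitted direction of that asymmetry can be sketched as follows: an object whose effective type is unsigned int is inspected byte by byte through an unsigned char lvalue, which 6.5p7 allows. The helper name count_bytes_equal is hypothetical and only serves as an illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper (not from the question). Counts how many bytes of
 * the object representation of *obj are equal to `byte`. The object has
 * effective type unsigned int, and it is read through unsigned char
 * lvalues, the direction that 6.5p7 explicitly permits. */
static size_t count_bytes_equal(const unsigned int *obj, unsigned char byte)
{
    assert(obj != NULL);

    const unsigned char *p = (const unsigned char *)obj;
    size_t n = 0;

    for (size_t i = 0; i < sizeof *obj; i++) {
        if (p[i] == byte)
            n++;
    }
    return n;
}
```

The reverse direction, reading a declared unsigned char array through an unsigned int lvalue, is exactly statement (A) in the question.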

Recommended answer

On a superficial examination, I'd agree with your assessment (A is UB, B is fine), and can offer a concrete rationale for why that should be so (prior to the edit to include _Alignas()): Alignment.

The char[] on the stack can start at any address, whether or not that is a valid alignment for an unsigned int. In contrast, malloc() is required to return memory suitably aligned for any object type with a fundamental alignment requirement, which includes unsigned int.

The standard obviously doesn't want to impose alignment requirements on char[] beyond those of char, so it has to leave type-punned access to it as potentially undefined.
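The alignment concern can be made concrete with a small check. This is a sketch that rests on two assumptions: the optional type uintptr_t is available, and the implementation-defined pointer-to-integer conversion preserves address bits in the usual way, as on mainstream platforms. The helper name is_aligned_for_uint is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper (not from the answer). Returns nonzero if p is a
 * multiple of the alignment of unsigned int. Assumes uintptr_t exists and
 * that converting a pointer to it yields the numeric address, which the
 * standard leaves implementation-defined. */
static int is_aligned_for_uint(const void *p)
{
    assert(p != NULL);
    return (uintptr_t)p % _Alignof(unsigned int) == 0;
}
```

Under those assumptions, a plain unsigned char array may or may not pass this check, while a pointer returned by malloc() or an array declared with _Alignas(unsigned int) is expected to pass.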
