Can memcpy be used for type punning?


Question

This is a quote from the C11 Standard:

6.5 Expressions

...

6 The effective type of an object for an access to its stored value is the declared type of the object, if any. If a value is stored into an object having no declared type through an lvalue having a type that is not a character type, then the type of the lvalue becomes the effective type of the object for that access and for subsequent accesses that do not modify the stored value. If a value is copied into an object having no declared type using memcpy or memmove, or is copied as an array of character type, then the effective type of the modified object for that access and for subsequent accesses that do not modify the value is the effective type of the object from which the value is copied, if it has one. For all other accesses to an object having no declared type, the effective type of the object is simply the type of the lvalue used for the access.

7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

— a type compatible with the effective type of the object,

— a qualified version of a type compatible with the effective type of the object,

— a type that is the signed or unsigned type corresponding to the effective type of the object,

— a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

— an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

— a character type.

Does this imply that memcpy cannot be used for type punning this way:

double d = 1234.5678;
uint64_t bits;
memcpy(&bits, &d, sizeof bits);
printf("the representation of %g is %08"PRIX64"\n", d, bits);

Why would it not give the same output as:

union { double d; uint64_t i; } u;
u.d = 1234.5678;
printf("the representation of %g is %08"PRIX64"\n", u.d, u.i);

What if I use my version of memcpy using character types:

void *my_memcpy(void *dst, const void *src, size_t n) {
    unsigned char *d = dst;
    const unsigned char *s = src;
    for (size_t i = 0; i < n; i++) { d[i] = s[i]; }
    return dst;
}

EDIT: EOF commented that the part about memcpy() in paragraph 6 doesn't apply in this situation, since uint64_t bits has a declared type. I agree, but unfortunately this does not help answer the question of whether memcpy can be used for type punning; it just makes paragraph 6 irrelevant for assessing the validity of the above examples.

Here is another attempt at type punning with memcpy that I believe would be covered by paragraph 6:

double d = 1234.5678;
void *p = malloc(sizeof(double));
if (p != NULL) {
    uint64_t *pbits = memcpy(p, &d, sizeof(double));
    uint64_t bits = *pbits;
    printf("the representation of %g is %08"PRIX64"\n", d, bits);
}

Assuming sizeof(double) == sizeof(uint64_t), does the above code have defined behavior under paragraphs 6 and 7?

EDIT: Some answers point to the potential for undefined behavior coming from reading a trap representation. This is not relevant as the C Standard explicitly excludes this possibility:

7.20.1.1 Exact-width integer types

1 The typedef name intN_t designates a signed integer type with width N, no padding bits, and a two’s complement representation. Thus, int8_t denotes such a signed integer type with a width of exactly 8 bits.

2 The typedef name uintN_t designates an unsigned integer type with width N and no padding bits. Thus, uint24_t denotes such an unsigned integer type with a width of exactly 24 bits.

These types are optional. However, if an implementation provides integer types with widths of 8, 16, 32, or 64 bits, no padding bits, and (for the signed types) that have a two’s complement representation, it shall define the corresponding typedef names.

Type uint64_t has exactly 64 value bits and no padding bits, thus there cannot be any trap representations.

Solution

There are two cases to consider: memcpy()ing into an object that has a declared type, and memcpy()ing into an object that does not.

In the second case,

double d = 1234.5678;
void *p = malloc(sizeof(double));
assert(p);
uint64_t *pbits = memcpy(p, &d, sizeof(double));
uint64_t bits = *pbits;
printf("the representation of %g is %08"PRIX64"\n", d, bits);

The behavior is indeed undefined, since the effective type of the object pointed to by p will become double, and accessing an object of effective type double through an lvalue of type uint64_t is undefined.
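One way to sidestep the problem in the allocated case is to copy the bytes back out into an object with a declared type rather than dereferencing a uint64_t pointer. The following is a minimal sketch, not part of the original answer; the helper name heap_double_bits is hypothetical, and it assumes double is 64 bits wide.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Assumption of this sketch: double and uint64_t have the same size. */
static_assert(sizeof(uint64_t) == sizeof(double),
              "this sketch assumes a 64-bit double");

/* Hypothetical helper: stores d in allocated storage, then reads its
 * bytes back into *out. Returns 0 on success, -1 if malloc fails. */
int heap_double_bits(double d, uint64_t *out) {
    void *p = malloc(sizeof d);
    if (p == NULL) {
        return -1;
    }
    /* The allocated object has no declared type, so after this copy
     * its effective type is double (6.5 paragraph 6). */
    memcpy(p, &d, sizeof d);

    /* bits has a declared type; memcpy reads the source as bytes, so
     * no uint64_t lvalue ever accesses the double object. */
    uint64_t bits;
    memcpy(&bits, p, sizeof bits);

    free(p);
    *out = bits;
    return 0;
}
```

A caller would check the return value and then print *out with the PRIX64 format macro, as in the snippets above.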

On the other hand,

double d = 1234.5678;
uint64_t bits;
memcpy(&bits, &d, sizeof bits);
printf("the representation of %g is %08"PRIX64"\n", d, bits);

is not undefined. C11 draft standard n1570:

7.24.1 String function conventions

3 For all functions in this subclause, each character shall be interpreted as if it had the type unsigned char (and therefore every possible object representation is valid and has a different value).

And

6.5 Expressions

7 An object shall have its stored value accessed only by an lvalue expression that has one of the following types:

- a type compatible with the effective type of the object,

- a qualified version of a type compatible with the effective type of the object,

- a type that is the signed or unsigned type corresponding to the effective type of the object,

- a type that is the signed or unsigned type corresponding to a qualified version of the effective type of the object,

- an aggregate or union type that includes one of the aforementioned types among its members (including, recursively, a member of a subaggregate or contained union), or

- a character type.

So the memcpy() itself is well-defined.

Since uint64_t bits has a declared type, it retains its type even though its object representation was copied from a double.

As chqrlie points out, uint64_t cannot have trap representations, so accessing bits after the memcpy() is not undefined, provided sizeof(uint64_t) == sizeof(double). However, the value of bits will be implementation-dependent (for example due to endianness).
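To illustrate the implementation dependence, the byte order of an object can be inspected through a character type, which the last bullet of 6.5 paragraph 7 permits. This is only a sketch: the function names are hypothetical, it assumes 8-bit bytes, and it only tells pure little-endian layouts apart from everything else.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: bytes are 8 bits wide. */
static_assert(CHAR_BIT == 8, "this sketch assumes 8-bit bytes");

/* Returns 1 if the least significant byte of a uint64_t is stored
 * first in memory, 0 otherwise. */
int uint64_is_little_endian(void) {
    uint64_t probe = 1;
    /* Reading an object through unsigned char is always allowed. */
    const unsigned char *bytes = (const unsigned char *)&probe;
    return bytes[0] == 1;
}

/* Copies the object representation of v into out, in memory order. */
void uint64_bytes(uint64_t v, unsigned char out[sizeof(uint64_t)]) {
    memcpy(out, &v, sizeof v);
}
```

On a little-endian machine the first byte written by uint64_bytes for the value 1 is 1; on a big-endian machine it is the last byte. The same integer value can therefore correspond to different byte sequences, and the standard does not require the byte order of double to match that of uint64_t.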

Conclusion: memcpy() can be used for type punning, provided that the destination of the memcpy() has a declared type, i.e. is not storage allocated by malloc(), calloc(), realloc() or an equivalent.
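The conclusion can be packaged as a pair of small helpers. This is a sketch under stated assumptions rather than code from the original answer: the names double_to_bits and bits_to_double are hypothetical, and the size check assumes a 64-bit double.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Assumption of this sketch: double and uint64_t have the same size. */
static_assert(sizeof(uint64_t) == sizeof(double),
              "this sketch assumes a 64-bit double");

/* Hypothetical helper: returns the object representation of d
 * reinterpreted as a 64-bit unsigned integer. */
uint64_t double_to_bits(double d) {
    uint64_t bits;                  /* declared type: stays uint64_t */
    memcpy(&bits, &d, sizeof bits); /* copies bytes as unsigned char */
    return bits;
}

/* Hypothetical inverse helper. Unlike uint64_t, double may in
 * principle have trap representations, so this direction is only
 * safe for bit patterns that are valid double representations. */
double bits_to_double(uint64_t bits) {
    double d;
    memcpy(&d, &bits, sizeof d);
    return d;
}
```

Both destinations are local variables with declared types, so the effective-type rule for memcpy in 6.5 paragraph 6 does not come into play. The numeric value returned by double_to_bits still depends on the implementation's floating-point format and byte order.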
