Is it safe to cast an unsigned char* to char*, and treat the dereferenced pointer as if it really points to a char?



Following the question titled "Warning generated due wrong strcmp parameter handling", there seem to be some questions about what the Standard actually guarantees regarding the value representation of character types.


THE QUESTION

This looks fine, but does the Standard guarantee that (1) will always yield true?

char unsigned * p1 = ...;
char          * p2 = reinterpret_cast<char *> (p1);

*p1 == *p2; // (1)

Solution

THIS MIGHT SURPRISE YOU,

but there's no such guarantee in the C++11 Standard (N3337), nor in the upcoming C++14 (N3797).

char unsigned * p1 = ...;
char          * p2 = reinterpret_cast<char *> (p1);

*p1 == *p2; // (1), not guaranteed to be true

Note: it is implementation-defined whether char is signed or unsigned; [basic.fundamental]p1.
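Whether plain char is signed on a given implementation can be checked at compile time. The following is only a minimal sketch; the helper name plain_char_is_signed is made up for illustration and is not part of the original answer.

```cpp
#include <climits>
#include <limits>

// Hypothetical helper: true when plain char is a signed type here.
constexpr bool plain_char_is_signed() {
    return std::numeric_limits<char>::is_signed;
}

// CHAR_MIN is negative exactly when plain char is signed,
// so the two checks must agree on every implementation.
static_assert(plain_char_is_signed() == (CHAR_MIN < 0),
              "numeric_limits and CHAR_MIN agree");
```

If plain_char_is_signed() is false, char and unsigned char hold the same range of values and comparison (1) cannot fail; the interesting case is the one where it is true.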



DETAILS

The Standard guarantees that all character types:

  • have the same alignment requirement;
  • occupy the same amount of storage;
  • use every bit of the storage they occupy in their value representation; and
  • have the same value representation.

Sharing the same amount of storage, the same alignment requirement, and the guarantee about bit participation means that casting an lvalue referring to one type (unsigned char) to another (char) is safe, as far as the cast itself is concerned.

3.9.1p1 Fundamental types [basic.fundamental]

It is implementation-defined whether a char can hold negative values. Characters can be explicitly declared signed or unsigned.

A char, a signed char, and an unsigned char occupy the same amount of storage and have the same alignment requirements (3.11); that is, they have the same object representation. For character types, all bits of the object representation participate in the value representation.

For unsigned character types, all possible bit patterns of the value representation represent numbers. These requirements do not hold for other types.

3.9p4 Types [basic.types]

The object representation of an object of type T is the sequence of N unsigned char objects taken up by the object of type T, where N equals sizeof(T). The value representation of an object is the set of bits that hold the value of type T.
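The layout guarantees quoted above can be written down as compile-time checks. This is a sketch rather than anything from the original answer: the helper name char_types_share_layout is hypothetical, and the last assertion assumes CHAR_BIT is smaller than 64 so that the shift is well defined.

```cpp
#include <climits>

// Hypothetical helper: true when the three character types have the
// same size and the same alignment, as 3.9.1p1 requires.
constexpr bool char_types_share_layout() {
    return sizeof(char) == sizeof(unsigned char)
        && sizeof(char) == sizeof(signed char)
        && alignof(char) == alignof(unsigned char)
        && alignof(char) == alignof(signed char);
}

static_assert(char_types_share_layout(), "same storage and alignment");

// Every bit pattern of an unsigned char represents a number, so its
// maximum value is 2^CHAR_BIT - 1 (assumes CHAR_BIT < 64).
static_assert(UCHAR_MAX == (1ULL << CHAR_BIT) - 1, "all bits are value bits");
```

Note that none of these checks says anything about which value a given bit pattern stands for when it is read as a char, and that is where the problem lies.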



SO, WHAT IS THE PROBLEM?

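If we assign the maximum value of an unsigned char (UCHAR_MAX) to *p1 and char is signed, *p2 won't be able to represent this value. Reading the same byte through *p2 reinterprets its bit pattern, and it will, most likely (on any two's-complement implementation), end up having the value of -1.

Note: nothing is actually overflowed here, the byte is only reinterpreted; signed integer overflow in an arithmetic expression would be undefined behavior.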


*p1 = UCHAR_MAX;

*p1 == *p2; // (1)


Both sides of operator== must have the same type before we can compare them, and currently one side is unsigned char and the other char.

The compiler will therefore apply the integral promotions to both operands. On an implementation where int can represent every value of both types, which is the usual case, both operands are promoted to int.

After the integral promotions, the statement is semantically equivalent to int(UCHAR_MAX) == int(-1), which of course is false.
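The whole scenario can be reproduced with a small sketch. The function names below are hypothetical, and the second function assumes a two's-complement implementation, where converting the char back to unsigned char recovers the original byte.

```cpp
#include <climits>
#include <limits>

// Hypothetical helper: stores v in an unsigned char, reads the same
// byte back through a char*, and reports whether comparison (1) holds.
bool compares_equal_through_char(unsigned char v) {
    unsigned char storage = v;
    unsigned char* p1 = &storage;
    char* p2 = reinterpret_cast<char*>(p1);
    return *p1 == *p2;  // both operands are promoted before comparing
}

// Hypothetical helper: compares the bytes rather than the promoted
// values by converting the char back to unsigned char first.
// Assumes two's complement.
bool same_byte_through_char(unsigned char v) {
    unsigned char storage = v;
    unsigned char* p1 = &storage;
    char* p2 = reinterpret_cast<char*>(p1);
    return *p1 == static_cast<unsigned char>(*p2);
}
```

Where char is signed, compares_equal_through_char(UCHAR_MAX) returns false, while values up to CHAR_MAX compare equal. Converting back to unsigned char before comparing, as in same_byte_through_char, is one way to sidestep the mismatch.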
