Why do C and C++ hate signed char so much?


Question

Why does C allow accessing an object through any "character type":

6.5 Expressions (C)

An object shall have its stored value accessed only by an lvalue expression that has one of the following types:


      
  • a character type.

But C++ only allows char and unsigned char:

3.10 Lvalues and rvalues (C++)

If a program attempts to access the stored value of an object through a glvalue of other than one of the following types the behavior is undefined:


      
  • a char or unsigned char type.

Another portion of signed char hatred (quote from the C++ standard):

3.9 Types (C++)

For any object (other than a base-class subobject) of trivially copyable type T, whether or not the object holds a valid value of type T, the underlying bytes making up the object can be copied into an array of char or unsigned char. If the content of the array of char or unsigned char is copied back into the object, the object shall subsequently hold its original value.

And from the C standard:

6.2.6 Representations of types (C)

Values stored in non-bit-field objects of any other object type consist of n × CHAR_BIT bits, where n is the size of an object of that type, in bytes. The value may be copied into an object of type unsigned char [n] (e.g., by memcpy); the resulting set of bytes is called the object representation of the value.
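As a rough illustration of the 6.2.6 wording, the C sketch below copies an object into an unsigned char [n] array with memcpy and then copies it back. The function names are made up for this example and do not come from either standard.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper: copies the object representation of a double into
   an unsigned char array and back. Returns 1 if the restored object has
   exactly the same bytes as the original. */
int round_trip_preserves_value(double original)
{
    unsigned char bytes[sizeof original];   /* unsigned char [n], n = sizeof(double) */
    double restored;

    memcpy(bytes, &original, sizeof original);   /* object -> byte array */

    /* Sanity check: the array now holds the object representation. */
    assert(memcmp(bytes, &original, sizeof original) == 0);

    memcpy(&restored, bytes, sizeof restored);   /* byte array -> object */

    /* Compare representations rather than values, so NaN is handled too. */
    return memcmp(&original, &restored, sizeof original) == 0;
}

/* Hypothetical helper: size of the representation in bits, n * CHAR_BIT. */
size_t representation_bits_of_double(void)
{
    return sizeof(double) * CHAR_BIT;
}
```

Calling round_trip_preserves_value(3.14) is expected to return 1 on any conforming implementation, since the bytes are copied verbatim in both directions.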

I can see many people on Stack Overflow saying this is because unsigned char is the only character type that is guaranteed not to have padding bits, but C99 section 6.2.6.2 (Integer types) says:

signed char shall not have any padding bits

So what is the real reason behind this?

Recommended answer

Here's my take on the motivation:

On a non-two's-complement system, signed char is not suitable for accessing the representation of an object. This is because either there are two possible signed char representations that have the same value (+0 and -0), or there is one representation that has no value (a trap representation). In either case, this prevents you from doing most of the meaningful things you might do with the representation of an object. For example, if you have a 16-bit unsigned integer 0x80ff, one or the other byte, read as a signed char, is going to either trap or compare equal to 0.
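The 0x80ff example can be simulated on ordinary hardware. The sketch below is only an illustration under stated assumptions: it assumes 8-bit bytes (so uint8_t exists), models the negative-zero case rather than the trap case, and uses hypothetical function names that decode a bit pattern the way a sign-and-magnitude or ones' complement signed char would.

```c
#include <assert.h>
#include <stdint.h>

/* Value of an 8-bit pattern under sign-and-magnitude:
   the top bit is the sign, the low 7 bits are the magnitude. */
int value_sign_magnitude(uint8_t bits)
{
    int magnitude = bits & 0x7F;
    return (bits & 0x80) ? -magnitude : magnitude;   /* 0x80 is -0, i.e. 0 */
}

/* Value of an 8-bit pattern under ones' complement:
   a negative number is the bitwise inverse of its magnitude. */
int value_ones_complement(uint8_t bits)
{
    if (bits & 0x80)
        return -(int)(uint8_t)~bits;                 /* 0xFF is -0, i.e. 0 */
    return bits;
}

/* Value of an 8-bit pattern under two's complement, for comparison. */
int value_twos_complement(uint8_t bits)
{
    return (bits & 0x80) ? (int)bits - 256 : (int)bits;
}

/* Checks the claim for 0x80ff: under each non-two's-complement scheme,
   one of the two bytes reads as zero although its bits are not all zero. */
void check_0x80ff(void)
{
    uint16_t value = 0x80FF;
    uint8_t high = (uint8_t)(value >> 8);     /* 0x80 */
    uint8_t low  = (uint8_t)(value & 0xFF);   /* 0xFF */

    assert(value_sign_magnitude(high) == 0);  /* negative zero */
    assert(value_ones_complement(low) == 0);  /* negative zero */

    /* Two's complement gives each pattern a distinct value. */
    assert(value_twos_complement(high) == -128);
    assert(value_twos_complement(low) == -1);
}
```

Under both simulated schemes a nonzero byte compares equal to an all-zero byte, so code that inspects the representation through such a signed char could not tell the two apart. Reading the same bytes as unsigned char gives 128 and 255 with no ambiguity.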

Note that on such a (non-two's-complement) implementation, plain char needs to be defined as an unsigned type for accessing the representations of objects via char to work correctly. While there's no explicit requirement, I see this as a requirement derived from other requirements in the standard.
