Implicit type promotion rules


Problem description

This post is meant to be used as a FAQ regarding implicit integer promotion in C, particularly implicit promotion caused by the usual arithmetic conversions and/or the integer promotions.

Example 1)
Why does this give a strange, large integer number and not 255?

unsigned char x = 0;
unsigned char y = 1;
printf("%u\n", x - y); 

Example 2)
Why does this give "-1 is larger than 0"?

unsigned int a = 1;
signed int b = -2;
if(a + b > 0)
  puts("-1 is larger than 0");

Example 3)
Why does changing the type in the above example to short fix the problem?

unsigned short a = 1;
signed short b = -2;
if(a + b > 0)
  puts("-1 is larger than 0"); // will not print

(These examples were intended for a 32 or 64 bit computer with 16 bit short.)

Solution

C was designed to implicitly and silently change the integer types of the operands used in expressions. There exist several cases where the language forces the compiler to either change the operands to a larger type, or to change their signedness.

The rationale behind this is to prevent accidental overflows during arithmetic, but also to allow operands with different signedness to co-exist in the same expression.

Unfortunately, the rules for implicit type promotion cause much more harm than good, to the point where they might be one of the biggest flaws in the C language. These rules are often not even known by the average C programmer and therefore cause all manner of very subtle bugs.

Typically you see scenarios where the programmer says "just cast to type x and it works" - but they don't know why. Or such bugs manifest themselves as rare, intermittent phenomena striking from within seemingly simple and straightforward code. Implicit promotion is particularly troublesome in code doing bit manipulations, since most bit-wise operators in C come with poorly-defined behavior when given a signed operand.


Integer types and conversion rank

The integer types in C are char, short, int, long, long long and enum.
_Bool/bool is also treated as an integer type when it comes to type promotions.

All integers have a specified conversion rank. C11 6.3.1.1, emphasis mine on the most important parts:

Every integer type has an integer conversion rank defined as follows:
— No two signed integer types shall have the same rank, even if they have the same representation.
— The rank of a signed integer type shall be greater than the rank of any signed integer type with less precision.
— The rank of long long int shall be greater than the rank of long int, which shall be greater than the rank of int, which shall be greater than the rank of short int, which shall be greater than the rank of signed char.
— The rank of any unsigned integer type shall equal the rank of the corresponding signed integer type, if any.

— The rank of any standard integer type shall be greater than the rank of any extended integer type with the same width.
— The rank of char shall equal the rank of signed char and unsigned char.
— The rank of _Bool shall be less than the rank of all other standard integer types.
— The rank of any enumerated type shall equal the rank of the compatible integer type (see 6.7.2.2).

The types from stdint.h sort in here too, with the same rank as whatever type they happen to correspond to on the given system. For example, int32_t has the same rank as int on a 32 bit system.

Further, C11 6.3.1.1 specifies which types are regarded as the small integer types (not a formal term):

The following may be used in an expression wherever an int or unsigned int may be used:

— An object or expression with an integer type (other than int or unsigned int) whose integer conversion rank is less than or equal to the rank of int and unsigned int.

What this somewhat cryptic text means in practice is that _Bool, char and short (and also int8_t, uint8_t etc) are the "small integer types". These are treated in special ways and subject to implicit promotion, as explained below.


The integer promotions

Whenever a small integer type is used in an expression, it is implicitly converted to int which is always signed. This is known as the integer promotions or the integer promotion rule.

Formally, the rule says (C11 6.3.1.1):

If an int can represent all values of the original type (as restricted by the width, for a bit-field), the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions.

This means that all small integer types, no matter signedness, get implicitly converted to (signed) int when used in most expressions.

This text is often misunderstood as: "all small, signed integer types are converted to signed int and all small, unsigned integer types are converted to unsigned int". This is incorrect. The unsigned part here only means that if we have for example an unsigned short operand, and int happens to have the same size as short on the given system, then the unsigned short operand is converted to unsigned int. As in, nothing of note really happens. But in case short is a smaller type than int, it is always converted to (signed) int, regardless of whether the short was signed or unsigned!
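The promoted type can be checked at compile time with C11 _Generic. The following is only a minimal sketch: HAS_TYPE and the two function names are hypothetical helpers made up for this illustration, and the first function assumes a platform where int can represent every unsigned char value (true on mainstream systems).

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper macro (not part of the standard library):
   true when expression e has exactly type T. */
#define HAS_TYPE(e, T) _Generic((e), T: true, default: false)

/* unsigned char + unsigned char: both operands are promoted to int,
   so the sum has type int, not unsigned char or unsigned int.
   Assumes UCHAR_MAX <= INT_MAX. */
bool uchar_sum_is_int(void)
{
    unsigned char a = 1, b = 2;
    return HAS_TYPE(a + b, int);
}

/* Unary plus also triggers the integer promotions. The promoted type of
   unsigned short depends on whether int can hold all of its values. */
bool ushort_promotes_as_expected(void)
{
    unsigned short s = 1;
#if USHRT_MAX <= INT_MAX
    return HAS_TYPE(+s, int);           /* typical: 16 bit short, 32 bit int */
#else
    return HAS_TYPE(+s, unsigned int);  /* e.g. short and int both 16 bits */
#endif
}
```

Both functions should return true on a conforming C11 compiler; the #if branch mirrors the "if an int can represent all values" wording of the rule.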

The harsh reality caused by the integer promotions means that almost no operation in C can be carried out on small types like char or short. Operations are always carried out on int or larger types.

This might sound like nonsense, but luckily the compiler is allowed to optimize the code. For example, an expression containing two unsigned char operands would get the operands promoted to int and the operation carried out as int. But the compiler is allowed to optimize the expression to actually get carried out as an 8 bit operation, as would be expected. However, here comes the problem: the compiler is not allowed to optimize out the implicit change of signedness caused by the integer promotion, because there is no way for the compiler to tell whether the programmer is purposely relying on the implicit promotion or whether it is unintentional.

This is why example 1 in the question fails. Both unsigned char operands are promoted to type int, the operation is carried out on type int, and the result of x - y is of type int. This means that we get -1 instead of the 255 that might have been expected. The compiler may generate machine code that executes the code with 8 bit instructions instead of int, but it may not optimize out the change of signedness. We therefore end up with a negative result, which in turn produces a weird number when printf("%u", ...) is invoked. Example 1 could be fixed by casting the result of the operation back to type unsigned char.
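A minimal sketch of that fix, assuming an 8 bit unsigned char. The function names diff_promoted and diff_wrapped are made up for this illustration and are not from the original question.

```c
/* The subtraction is carried out on int after promotion,
   so 0 - 1 yields the int value -1. */
int diff_promoted(unsigned char x, unsigned char y)
{
    return x - y;
}

/* Casting the int result back to unsigned char reduces it modulo
   UCHAR_MAX + 1, so 0 - 1 yields 255 when unsigned char is 8 bits wide. */
unsigned char diff_wrapped(unsigned char x, unsigned char y)
{
    return (unsigned char)(x - y);
}
```

Note that the unsigned char result is promoted to int again when passed to printf, so the matching conversion specifier for diff_wrapped(0, 1) would be %d rather than %u.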

With the exception of a few special cases like ++ and sizeof operators, the integer promotions apply to almost all operations in C, no matter if unary, binary (or ternary) operators are used.
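A classic case where promotion of a unary operand bites is bitwise complement on a small unsigned type. The sketch below uses hypothetical function names; it relies on uint8_t from stdint.h, which exists on any platform with an 8 bit type.

```c
#include <stdbool.h>
#include <stdint.h>

/* value is promoted to int before ~ is applied, so the result has all
   upper bits set and is negative. It can never equal a uint8_t value,
   and this comparison is always false. */
bool inverted_equals_naive(uint8_t value, uint8_t expected)
{
    return ~value == expected;
}

/* Casting the complement back to uint8_t discards the upper bits
   before the comparison, which gives the intended 8 bit result. */
bool inverted_equals_cast(uint8_t value, uint8_t expected)
{
    return (uint8_t)~value == expected;
}
```

For value 0x0F and expected 0xF0, the naive version returns false while the cast version returns true. Some compilers warn about the naive comparison, which is a useful hint that a promotion is involved.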


The usual arithmetic conversions

Whenever a binary operation (an operation with 2 operands) is done in C, both operands of the operator have to be of the same type. Therefore, in case the operands are of different types, C enforces an implicit conversion of one operand to the type of the other operand. The rules for how this is done are named the usual arithmetic conversions (sometimes informally referred to as "balancing"). These are specified in C11 6.3.1.8:

(Think of this rule as a long, nested if-else if statement and it might be easier to read :) )

6.3.1.8 Usual arithmetic conversions

Many operators that expect operands of arithmetic type cause conversions and yield result types in a similar way. The purpose is to determine a common real type for the operands and result. For the specified operands, each operand is converted, without change of type domain, to a type whose corresponding real type is the common real type. Unless explicitly stated otherwise, the common real type is also the corresponding real type of the result, whose type domain is the type domain of the operands if they are the same, and complex otherwise. This pattern is called the usual arithmetic conversions:

  • First, if the corresponding real type of either operand is long double, the other operand is converted, without change of type domain, to a type whose corresponding real type is long double.
  • Otherwise, if the corresponding real type of either operand is double, the other operand is converted, without change of type domain, to a type whose corresponding real type is double.
  • Otherwise, if the corresponding real type of either operand is float, the other operand is converted, without change of type domain, to a type whose corresponding real type is float.
  • Otherwise, the integer promotions are performed on both operands. Then the following rules are applied to the promoted operands:

    • If both operands have the same type, then no further conversion is needed.
    • Otherwise, if both operands have signed integer types or both have unsigned integer types, the operand with the type of lesser integer conversion rank is converted to the type of the operand with greater rank.
    • Otherwise, if the operand that has unsigned integer type has rank greater or equal to the rank of the type of the other operand, then the operand with signed integer type is converted to the type of the operand with unsigned integer type.
    • Otherwise, if the type of the operand with signed integer type can represent all of the values of the type of the operand with unsigned integer type, then the operand with unsigned integer type is converted to the type of the operand with signed integer type.
    • Otherwise, both operands are converted to the unsigned integer type corresponding to the type of the operand with signed integer type.
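The rules above can be probed with C11 _Generic, which selects on the type of an expression without evaluating it. This is only a sketch: HAS_TYPE and the function names are hypothetical helpers for illustration, and the third function uses a preprocessor check because the outcome depends on whether long can represent every unsigned int value on the target.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper macro: true when expression e has exactly type T. */
#define HAS_TYPE(e, T) _Generic((e), T: true, default: false)

/* One operand is float: the integer operand is converted to float. */
bool int_plus_float_is_float(void)
{
    return HAS_TYPE(1 + 1.0f, float);
}

/* Both signed, different rank: int is converted to long. */
bool int_plus_long_is_long(void)
{
    return HAS_TYPE(1 + 1L, long);
}

/* Same rank, different signedness: int is converted to unsigned int. */
bool int_plus_unsigned_is_unsigned(void)
{
    return HAS_TYPE(-1 + 1u, unsigned int);
}

/* Signed operand has the greater rank: the result is long if long can
   represent all unsigned int values, otherwise unsigned long. */
bool long_plus_unsigned_follows_rules(void)
{
#if LONG_MAX >= UINT_MAX
    return HAS_TYPE(1L + 1u, long);           /* e.g. 64 bit long */
#else
    return HAS_TYPE(1L + 1u, unsigned long);  /* e.g. 32 bit long */
#endif
}
```

Each function should return true on a conforming implementation. The last one shows that the same source expression can have a different type, and a different signedness, on different platforms.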

Notable here is that the usual arithmetic conversions apply to both floating point and integer variables. In case of integers, we can also note that the integer promotions are invoked from within the usual arithmetic conversions. And after that, when both operands have at least the rank of int, the operands are balanced to the same type, with the same signedness.

This is the reason why a + b in example 2 gives a strange result. Both operands are integers and they are at least of rank int, so the integer promotions do not apply. The operands are not of the same type - a is unsigned int and b is signed int. Therefore the operand b is temporarily converted to type unsigned int. During this conversion it loses the sign information and ends up as a large value.
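One possible way to get the mathematically expected answer in example 2 is to widen the unsigned operand to a larger signed type before adding. This is a sketch under the assumption that long long is wider than unsigned int (true on common 32 and 64 bit platforms); the function names are invented for this illustration.

```c
#include <stdbool.h>

/* b is converted to unsigned int, so -2 becomes UINT_MAX - 1 and the
   sum 1 + (UINT_MAX - 1) is UINT_MAX, which is greater than 0. */
bool sum_is_positive_naive(unsigned int a, int b)
{
    return a + b > 0;
}

/* a is first converted to long long, then b is converted to long long
   as well, so the addition happens in a signed type that can hold both.
   Assumes long long can represent every unsigned int value. */
bool sum_is_positive_widened(unsigned int a, int b)
{
    return (long long)a + b > 0;
}
```

With a = 1 and b = -2, the naive version returns true and the widened version returns false.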

Changing the type to short in example 3 fixes the problem because short is a small integer type. This means that both operands are integer promoted to type int, which is signed. After integer promotion, both operands have the same type (int), so no further conversion is needed, and the operation can be carried out on a signed type as expected.
