How to avoid integer promotion in C?

Question

It is not clear how to write portable C code using the wide-character API. Consider this example:

#include <locale.h>
#include <wchar.h>
#include <wctype.h>
int main(void)
{
  setlocale(LC_CTYPE, "C.UTF-8");
  wchar_t wc = L'ÿ';
  if (iswlower(wc)) return 0;
  return 1;
}

Compiling it with gcc-6.3.0 using the -Wconversion option gives this warning:

test.c: In function 'main':
test.c:9:16: warning: conversion to 'wint_t {aka unsigned int}' from 'wchar_t {aka int}' may change the sign of the result [-Wsign-conversion]
if (iswlower(wc)) return 0;
             ^

To get rid of this warning, we can cast to (wint_t), as in iswlower((wint_t)wc), but this is not portable. The following example demonstrates why.

#include <stdio.h>

/* this is our hypothetical implementation */
typedef signed int wint_t;
typedef signed short wchar_t;
#define WEOF ((wint_t)0xffffffff)

void f(wint_t wc)
{
    if (wc==WEOF)
      printf("BUG. Valid character recognized as WEOF. This is due to integer promotion. How to avoid it?\n");
}
int main(void)
{
    wchar_t wc = (wchar_t)0xffff;
    f((wint_t)wc);
    return 0;
}

My question is: how can this example be made portable while also avoiding the gcc warning?

Recommended answer

To keep things simple, I'm going to assume that the platform/implementation I'm discussing has the following characteristics:

  • two's complement integer types
  • int is 32 bits
  • short is 16 bits

I'm also going to use C99 as the reference, simply because it's what I have open.

The standard says the following must be true of these types and macros:

  • wint_t must be able to hold at least one value that does not correspond to any member of the extended character set (7.24.1/2)
  • WEOF has a value that does not correspond to any member of the extended character set (7.24.1/3)
  • wchar_t can represent all values of the largest extended character set (7.17/2)

Keep in mind that by the C standard's definition of "value", the value of (short int) 0xffff is the same as the value of (int) 0xffffffff; that is, they both have the value -1 (given the assumptions stated at the beginning of this answer). This is made clear by the standard's description of the integer promotions (6.3.1.1):

If an int can represent all values of the original type, the value is converted to an int; otherwise, it is converted to an unsigned int. These are called the integer promotions. All other types are unchanged by the integer promotions.

The integer promotions preserve value including sign.
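As a minimal sketch of that point, the following mirrors the question's hypothetical implementation with fixed-width types, so it does not depend on the host's actual short and int sizes. The names demo_wchar_t, demo_wint_t and widen_signed are assumptions made up for illustration; they are not part of any real library.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the question's types (assumed names). */
typedef int16_t demo_wchar_t;   /* signed 16-bit "wchar_t" */
typedef int32_t demo_wint_t;    /* signed 32-bit "wint_t"  */

/* Widening a signed 16-bit value to 32 bits preserves the value,
   including its sign, just as the integer promotions do. */
static demo_wint_t widen_signed(demo_wchar_t wc)
{
    return wc;  /* implicit, value-preserving conversion */
}
```

Under these assumptions, widen_signed((demo_wchar_t)-1) yields -1, whose 32-bit pattern is 0xffffffff. An explicit cast at the call site changes nothing here, because the value was already going to be preserved.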

I believe that when you combine these elements, it follows that if WEOF has the value -1, then no member of the extended character set can have the value -1. I think this means that in your example implementation, either wchar_t would have to be unsigned (if it remained a 16-bit type) or (wchar_t) 0xffff could not be a valid character.
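To illustrate the first of those two options, here is a small sketch of what happens when the 16-bit character type is unsigned. The names demo_uwchar_t, demo_wint_t and widen_unsigned are hypothetical and exist only for this example.

```c
#include <stdint.h>

/* Hypothetical variant of the question's types (assumed names):
   the 16-bit character type is unsigned this time. */
typedef uint16_t demo_uwchar_t;
typedef int32_t  demo_wint_t;

/* Every uint16_t value fits in a 32-bit signed integer, so the
   conversion keeps the value as a non-negative number. */
static demo_wint_t widen_unsigned(demo_uwchar_t wc)
{
    return wc;
}
```

With this choice, widen_unsigned(0xffff) is 65535 rather than -1, so it can no longer compare equal to a WEOF whose value is -1.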

But there's another alternative that I originally forgot, and it is probably the best solution for your example implementation: the standard states in a footnote that the "value of the macro WEOF may differ from that of EOF and need not be negative". So your implementation's problem can be fixed by making WEOF == INT_MAX, for example. That way it cannot have the same value as any wchar_t.
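A rough sketch of that alternative follows, again with fixed-width types standing in for the question's 16-bit wchar_t and 32-bit wint_t. DEMO_WEOF, demo_is_weof and count_weof_collisions are names invented for this illustration, and INT32_MAX plays the role of INT_MAX under the stated 32-bit int assumption.

```c
#include <stdint.h>

/* Hypothetical implementation types (assumed names). */
typedef int16_t demo_wchar_t;
typedef int32_t demo_wint_t;

/* WEOF chosen outside the range of any 16-bit character value. */
#define DEMO_WEOF ((demo_wint_t)INT32_MAX)

static int demo_is_weof(demo_wint_t wc)
{
    return wc == DEMO_WEOF;
}

/* Counts how many 16-bit character values are mistaken for WEOF
   after being widened to the 32-bit type. */
static int count_weof_collisions(void)
{
    int collisions = 0;
    for (int32_t v = INT16_MIN; v <= INT16_MAX; ++v) {
        demo_wchar_t wc = (demo_wchar_t)v;  /* v is always in range */
        if (demo_is_weof(wc))               /* wc widens to 32 bits */
            ++collisions;
    }
    return collisions;
}
```

Because every demo_wchar_t value lies between -32768 and 32767, none of them can equal INT32_MAX, so count_weof_collisions() returns 0.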

A WEOF value that overlaps with a valid character value is something I suppose might occur in real implementations (even if the standard seems to prohibit it). It is similar to the issues that have been raised about EOF possibly having the same value as some valid signed char value.

It might be of interest that for most (all?) functions that can return WEOF to indicate some sort of problem, the standard requires the function to set some additional indication of the error or condition (for example, setting errno to a particular value, or setting the end-of-file indicator on a stream).
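As one hedged illustration of relying on those additional indications, the sketch below wraps fgetwc and consults feof, ferror and errno before treating a WEOF return as end of input. The enum demo_read_status and the helper demo_read_wide are hypothetical names; only fgetwc, feof, ferror, errno and EILSEQ come from the standard library.

```c
#include <errno.h>
#include <stdio.h>
#include <wchar.h>

/* Hypothetical result codes for this sketch. */
enum demo_read_status { DEMO_READ_CHAR, DEMO_READ_EOF, DEMO_READ_ERROR };

/* Reads one wide character. A WEOF return is trusted only when the
   stream's indicators (or errno) confirm end-of-file or an error. */
static enum demo_read_status demo_read_wide(FILE *fp, wint_t *out)
{
    errno = 0;
    wint_t c = fgetwc(fp);
    if (c == WEOF) {
        if (feof(fp))
            return DEMO_READ_EOF;
        if (ferror(fp) || errno == EILSEQ)
            return DEMO_READ_ERROR;
        /* No indicator set: on an implementation where WEOF collides
           with a character value, fall through and treat c as data. */
    }
    *out = c;
    return DEMO_READ_CHAR;
}
```

On a conforming implementation the fall-through branch should never be taken; it only matters for the kind of overlapping WEOF discussed above.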

Another thing to note: as I understand it, 0xffff is a non-character in UCS-2 and UTF-16 (I have no idea about any other 16-bit encodings that might exist).
