What causes a char to be signed or unsigned when using gcc?


Question


    What determines whether a char in C (using gcc) is signed or unsigned? I know that the standard doesn't dictate one over the other and that I can check CHAR_MIN and CHAR_MAX from limits.h, but I want to know what triggers one over the other when using gcc.

    If I read limits.h from libgcc-6, I see that there is a macro __CHAR_UNSIGNED__ which makes a "default" char signed or unsigned, but I'm unsure whether this is set by the compiler at its build time.

    I tried to list GCC's predefined macros with

    $ gcc -dM -E -x c /dev/null | grep -i CHAR
    #define __UINT_LEAST8_TYPE__ unsigned char
    #define __CHAR_BIT__ 8
    #define __WCHAR_MAX__ 0x7fffffff
    #define __GCC_ATOMIC_CHAR_LOCK_FREE 2
    #define __GCC_ATOMIC_CHAR32_T_LOCK_FREE 2
    #define __SCHAR_MAX__ 0x7f
    #define __WCHAR_MIN__ (-__WCHAR_MAX__ - 1)
    #define __UINT8_TYPE__ unsigned char
    #define __INT8_TYPE__ signed char
    #define __GCC_ATOMIC_WCHAR_T_LOCK_FREE 2
    #define __CHAR16_TYPE__ short unsigned int
    #define __INT_LEAST8_TYPE__ signed char
    #define __WCHAR_TYPE__ int
    #define __GCC_ATOMIC_CHAR16_T_LOCK_FREE 2
    #define __SIZEOF_WCHAR_T__ 4
    #define __INT_FAST8_TYPE__ signed char
    #define __CHAR32_TYPE__ unsigned int
    #define __UINT_FAST8_TYPE__ unsigned char
    

    but I wasn't able to find __CHAR_UNSIGNED__.

    Background: I've some code which I compile on two different machines:

    Desktop PC:

    • Debian GNU/Linux 9.1 (stretch)
    • gcc version 6.3.0 20170516 (Debian 6.3.0-18)
    • Intel(R) Core(TM) i3-4150
    • libgcc-6-dev: 6.3.0-18
    • char is signed

    Raspberry Pi3:

    • Raspbian GNU/Linux 9.1 (stretch)
    • gcc version 6.3.0 20170516 (Raspbian 6.3.0-18+rpi1)
    • ARMv7 Processor rev 4 (v7l)
    • libgcc-6-dev: 6.3.0-18+rpi
    • char is unsigned

    So the only obvious difference is the CPU architecture...

    Answer

    According to the C11 standard (read n1570), char can be signed or unsigned (so you actually have two flavors of C). Which one you get is implementation-defined.

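    Some processors and instruction set architectures, or rather their application binary interfaces (ABIs), favor a signed character (byte) type (e.g. because it maps nicely to some machine code instruction); others favor an unsigned one. gcc follows the target ABI, which is why char is signed on x86 Linux and unsigned on ARM Linux, as on your two machines.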
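
    To make the difference concrete, here is a minimal sketch of what happens when the byte 0xFF is widened to int under each convention. The function names are hypothetical and chosen for illustration only; the signed case relies on GCC's documented modulo behavior for out-of-range conversions, which the C standard leaves implementation-defined.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* The sketch assumes 8-bit bytes, which is what GCC targets provide. */
static_assert(CHAR_BIT == 8, "sketch assumes 8-bit bytes");

/* Widen a byte the way a signed-char target (e.g. x86 Linux) treats plain char.
   Converting an out-of-range value to signed char is implementation-defined;
   GCC reduces it modulo 2^8, so 0xFF becomes -1. */
int widen_as_signed(uint8_t byte)
{
    return (signed char)byte;
}

/* Widen a byte the way an unsigned-char target (e.g. ARM Linux) treats plain char. */
int widen_as_unsigned(uint8_t byte)
{
    return (unsigned char)byte;
}

/* Portable pattern: go through unsigned char before widening, so the result
   is always in 0..255 whatever the signedness of plain char. */
int byte_value(char c)
{
    return (unsigned char)c;
}
```

    Code that indexes a table with a plain char or compares it against values above 127 is where this usually bites; routing the value through byte_value() (or using unsigned char or uint8_t from the start) gives the same result on both kinds of target.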

    gcc even has the -fsigned-char and -funsigned-char options, which you should almost never use (changing the default breaks some corner cases in calling conventions and ABIs) unless you recompile everything, including your C standard library.

    You could use feature_test_macros(7) and <endian.h> (see endian(3)) or autoconf on Linux to detect what your system has.

    In most cases, you should write portable C code that does not depend on those things. You can find cross-platform libraries (e.g. glib) to help you with that.

    BTW, gcc -dM -E -x c /dev/null also gives __BYTE_ORDER__ etc. If you want an unsigned 8-bit byte, you should use <stdint.h> and its uint8_t (more portable and more readable). The standard limits.h defines CHAR_MIN, SCHAR_MIN, CHAR_MAX and SCHAR_MAX (you could compare them for equality to detect implementations where char is signed), etc.
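
    The limits.h comparison can be sketched as follows. GCC predefines __CHAR_UNSIGNED__ only when plain char is unsigned, so the macro is simply absent from the -dM output on a target where char is signed. PLAIN_CHAR_IS_UNSIGNED and the two function names are hypothetical helpers introduced for this example, and the macro-based check is only meaningful on compilers that follow GCC's convention.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Compile-time view: the macro exists only when plain char is unsigned. */
#ifdef __CHAR_UNSIGNED__
#define PLAIN_CHAR_IS_UNSIGNED 1   /* hypothetical helper macro */
#else
#define PLAIN_CHAR_IS_UNSIGNED 0
#endif

/* The standard requires CHAR_MIN to equal SCHAR_MIN when char is signed
   and to be 0 when char is unsigned, so both tests must agree. */
static_assert((CHAR_MIN < 0) == (CHAR_MIN == SCHAR_MIN),
              "limits.h is inconsistent about the signedness of char");

/* Portable check that uses only <limits.h>. */
bool char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Runtime cross-check: -1 converted to char stays negative only if char is signed. */
bool char_is_signed_runtime(void)
{
    return (char)-1 < 0;
}
```

    Calling char_is_signed() from main and printing the result should report a signed char on the desktop described in the question and an unsigned one on the Raspberry Pi; rebuilding with -funsigned-char or -fsigned-char flips the answer.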

    BTW, you should care about character encoding, but most systems today use UTF-8 everywhere. Libraries like libunistring are helpful. Remember that, practically speaking, a Unicode character encoded in UTF-8 can span several bytes (i.e. several chars).
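
    As a small illustration of that last point, the sketch below counts code points in a UTF-8 string by skipping continuation bytes (those of the form 10xxxxxx). utf8_codepoints is a hypothetical helper, not a libunistring function, and it assumes the input is valid UTF-8.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Count the code points in a NUL-terminated string that is assumed to be
   valid UTF-8. Every byte that is not a continuation byte starts a new
   code point. */
size_t utf8_codepoints(const char *s)
{
    size_t count = 0;

    assert(s != NULL);
    for (; *s != '\0'; ++s) {
        /* Cast through unsigned char so the bit test does not depend on
           whether plain char is signed. */
        if (((unsigned char)*s & 0xC0) != 0x80)
            ++count;
    }
    return count;
}
```

    For the string "na\xC3\xAFve" (the word "naive" with a diaeresis on the i), strlen reports 6 bytes while utf8_codepoints reports 5 code points.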
