Why does GCC's -Wconversion behave differently for char vs. unsigned char?


Problem description





    Consider

    U8 foo(U8 x, U8 y) {
        return x % y;
    }
    

    GCC's -Wconversion behaves differently depending on whether U8, the type of x and y, is char or unsigned char:

    gcc -Wconversion -c test.c -DU8='unsigned char'

    (no warning)

    gcc -Wconversion -c test.c -DU8=char
    test.c: In function ‘foo’:
    test.c:2:14: warning: conversion to ‘char’ from ‘int’ may alter its value [-Wconversion]
         return x % y;
                ~~^~~
    

    But as I understand it, in both cases x and y undergo integer promotion (to int or unsigned int), so in both cases the result is converted from int to the return type (char or unsigned char).

    Why is there a difference?

    Bonus question: if you enable ubsan (-fsanitize=undefined), GCC emits the -Wconversion warning in both cases.

    EDIT:

    There is no argument that x, y undergo integer promotion and then need to be converted to the result type, so no need to explain that.

    The only question here is why does GCC behave differently for different types. The answer will involve some insight on GCC's internals.

    Solution

    TLDR

    Using only information about the types involved, gcc should warn in both cases, because of the conversion from int (a larger type) to char/unsigned char (smaller types).

    Using information about the possible values as well (range analysis), gcc should warn in neither case, because the result of x % y, even after promotion to int, always fits back into the same type as x and y.

    So it seems that in the first case gcc can prove that the operation never results in a value change, but for some reason cannot do so in the second case.

    As a side note, clang does not warn in either case.


    Type system

    • On the tested system (x86-64) the char type is signed. Be aware that it is still a distinct type from signed char.

    • In x % y, due to the integer promotion rules, x and y are promoted to int in both cases. The result of x % y is of type int.

    • If we make all the implicit conversions explicit then we get this:

      unsigned char foo1(unsigned char x, unsigned char y)
      {
         return (unsigned char)((int) x % (int) y);
      }
      
      char foo2(char x, char y)
      {
         return (char)((int) x % (int) y);
      }
      

    • An implicit conversion from int to char, unsigned char, or signed char triggers a warning under -Wconversion:

      -Wconversion

      Warn for implicit conversions that may alter a value. This includes [..] and conversions to smaller types

      Indeed, both of these functions produce a warning:

      char bar1(int a)
      {
         return a; // warning: conversion from 'int' to 'char' may change value [-Wconversion]
      }
      
      unsigned char bar2(int a)
      {
         return a;  // warning: conversion from 'int' to 'unsigned char' may change value [-Wconversion]
      }
      

    So, using type information only, we should get a warning in both cases, because both of our functions contain an implicit conversion from int to char/unsigned char, just like bar1 and bar2.

    Value analysis

    If we use the notation r = x % y then r has the same sign as x and |r| ∈ [0, |y|).

    • if x and y are of type unsigned char then r ∈ [0, UCHAR_MAX).

      r fits in an unsigned char. So no warning needed.

    • if x and y are of type char:

      • CHAR_MIN = -CHAR_MAX - 1
      • max(|y|) = CHAR_MAX + 1
      • |r| ∈ [0, max(|y|))
      • |r| ∈ [0, CHAR_MAX + 1)
      • r ∈ (-CHAR_MAX - 1, CHAR_MAX + 1)

      r fits in a char so no warning needed.

    So what I am arguing is that the result of x % y always fits in a U8, even after all the integer promotions and implicit conversions.


    You can have a look at this godbolt
