Why does GCC's -Wconversion behave differently for char vs. unsigned char?
Consider
U8 foo(U8 x, U8 y) {
return x % y;
}
GCC's -Wconversion behaves differently depending on whether U8, the type of x and y, is char or unsigned char:
gcc -Wconversion -c test.c -DU8='unsigned char'
(no warning)
gcc -Wconversion -c test.c -DU8=char
test.c: In function ‘foo’:
test.c:2:14: warning: conversion to ‘char’ from ‘int’ may alter its value [-Wconversion]
return x % y;
~~^~~
But from what I understand in both cases x, y undergo integer promotion (to int or unsigned int) and so in both cases it will be converting from int to whatever the return type is (char or unsigned char).
Why is there a difference?
Bonus question: if you enable ubsan (-fsanitize=undefined), GCC emits the -Wconversion warning in both cases.
EDIT:
There is no argument that x, y undergo integer promotion and then need to be converted to the result type, so no need to explain that.
The only question here is why does GCC behave differently for different types. The answer will involve some insight on GCC's internals.
TLDR
Using information only about the types involved, gcc should warn in both cases, because of the conversion from int (larger type) to char/unsigned char (smaller types).
Using also information about the possible values (range analysis), gcc should warn in neither case, because the result of x % y, even after promotion to int, will always fit back into the same type as x and y.
So it seems that in the first case gcc can assert that the operation will never result in a value change, but for some reason cannot do that in the second case.
As a side note, clang does not warn in either case.
Type system
On the tested system (x86-64) the char type is signed. Please be aware that it is still a different type than signed char.
x % y: due to the integer promotion rules, in both cases x and y are promoted to int. The result of x % y is of type int.
If we make all the implicit conversions explicit then we get this:
unsigned char foo1(unsigned char x, unsigned char y)
{
    return (unsigned char)((int) x % (int) y);
}
char foo2(char x, char y)
{
    return (char)((int) x % (int) y);
}
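The two type-system claims above can be checked at compile time. The following is a minimal sketch, assuming a C11 compiler (for _Generic); the macro CHAR_KIND and the helper function names are hypothetical and exist only for illustration.

```c
#include <limits.h>

/* Hypothetical helper: 1 = char, 2 = signed char, 3 = unsigned char,
   0 = anything else. Listing all three in one _Generic is only legal
   because they are three distinct types. */
#define CHAR_KIND(e) \
    _Generic((e), char: 1, signed char: 2, unsigned char: 3, default: 0)

/* Returns 1 when plain char is signed on this target (implementation-defined). */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

int kind_of_plain_char(void)
{
    char c = 0;
    return CHAR_KIND(c);
}

int kind_of_signed_char(void)
{
    signed char c = 0;
    return CHAR_KIND(c);
}

/* Returns 1 when x % y has type int after integer promotion. */
int uchar_remainder_is_int(void)
{
    unsigned char x = 7, y = 3;
    return _Generic(x % y, int: 1, default: 0);
}

int char_remainder_is_int(void)
{
    char x = 7, y = 3;
    return _Generic(x % y, int: 1, default: 0);
}
```

plain_char_is_signed() is expected to return 1 on x86-64 with GCC's defaults, but the signedness of plain char is implementation-defined, so other targets may differ.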
Implicit conversion from int to char, to unsigned char and to signed char fires the warning with -Wconversion:
-Wconversion
Warn for implicit conversions that may alter a value. This includes [..] and conversions to smaller types
Indeed both these functions result in a warning getting generated:
char bar1(int a)
{
    return a; // warning: conversion from 'int' to 'char' may change value [-Wconversion]
}
unsigned char bar2(int a)
{
    return a; // warning: conversion from 'int' to 'unsigned char' may change value [-Wconversion]
}
So using type information only we should get a warning for both, because our 2 functions have an implicit conversion from int to char/unsigned char, just like bar1 and bar2.
Value analysis
If we use the notation r = x % y (with y ≠ 0), then r is either zero or has the same sign as x, and |r| ∈ [0, |y|).
If x and y are of type unsigned char, then r ∈ [0, UCHAR_MAX). r fits in an unsigned char, so no warning is needed.
If x and y are of type char:
CHAR_MIN = -CHAR_MAX - 1
max(|y|) = CHAR_MAX + 1
|r| ∈ [0, max(|y|))
|r| ∈ [0, CHAR_MAX + 1)
r ∈ (-CHAR_MAX - 1, CHAR_MAX + 1)
r fits in a char, so no warning is needed.
So what I am arguing is that the result of x % y always fits in a U8, even after all the integer promotions and implicit conversions.
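The range argument can also be confirmed by brute force, since both types have only a few hundred values. The sketch below is illustrative and makes no claim about GCC's internal range analysis. The function names are hypothetical, and y == 0 is skipped because the remainder is undefined there.

```c
#include <limits.h>

/* Counts (x, y) pairs with y != 0 for which the int result of x % y
   does not fit in unsigned char. */
int count_uchar_out_of_range(void)
{
    int bad = 0;
    for (int x = 0; x <= UCHAR_MAX; ++x) {
        for (int y = 1; y <= UCHAR_MAX; ++y) {
            int r = x % y;
            if (r < 0 || r > UCHAR_MAX)
                ++bad;
        }
    }
    return bad;
}

/* Same check over the full range of plain char. Using CHAR_MIN and
   CHAR_MAX keeps it valid whether char is signed or unsigned. */
int count_char_out_of_range(void)
{
    int bad = 0;
    for (int x = CHAR_MIN; x <= CHAR_MAX; ++x) {
        for (int y = CHAR_MIN; y <= CHAR_MAX; ++y) {
            if (y == 0)
                continue;
            int r = x % y;
            if (r < CHAR_MIN || r > CHAR_MAX)
                ++bad;
        }
    }
    return bad;
}

/* Counts pairs where a nonzero remainder has the opposite sign of x,
   or where |r| >= |y|. */
int count_sign_or_bound_violations(void)
{
    int bad = 0;
    for (int x = CHAR_MIN; x <= CHAR_MAX; ++x) {
        for (int y = CHAR_MIN; y <= CHAR_MAX; ++y) {
            if (y == 0)
                continue;
            int r = x % y;
            int abs_r = r < 0 ? -r : r;
            int abs_y = y < 0 ? -y : y;
            if ((r > 0 && x < 0) || (r < 0 && x > 0) || abs_r >= abs_y)
                ++bad;
        }
    }
    return bad;
}
```

All three counters should come back as zero, which matches the claim that no value change is possible when the int result is converted back to U8.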
You can have a look at this godbolt