What's happening in the background of an unsigned char to integer type cast?


Question


I was getting some odd behaviour out of a switch block today. Specifically, I was reading a byte from a file and comparing it against certain hex values (a text file encoding issue, no big deal). The code looked something like:

char BOM[3] = {0};
b_error = ReadFile (iNCfile, BOM, 3, &lpNumberOfBytesRead, NULL); 

switch ( BOM[0] ) {
case 0xef: {
    // Byte Order Marker Potentially Indicates UTF-8
    if ( ( BOM[1] == 0xBB ) && ( BOM[2] == 0xBF ) ) {
        iNCfileEncoding = UTF8;
    }
    break;
}
}

This didn't work, although everything looked OK in the debugger. I realized that the switch was promoting the values to integers, and once that clicked into place I was able to match using 0xffffffef in the case statement. Of course the correct solution was to make BOM[] unsigned, and now everything promotes and compares as expected.
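A minimal sketch of the unsigned fix, assuming the three bytes have already been read into a buffer. The helper name `is_utf8_bom` and its `size` parameter are hypothetical and not part of the original code. Because the elements are `unsigned char`, each byte promotes to a non-negative `int`, so the comparisons against `0xEF`, `0xBB` and `0xBF` can succeed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (not from the original code): returns true when the
// first three bytes of the buffer are the UTF-8 byte order mark EF BB BF.
// Each unsigned char promotes to an int in the range 0..255, so the case
// label 0xEF and the literals 0xBB and 0xBF compare as expected.
bool is_utf8_bom(const unsigned char* bytes, std::size_t size)
{
    assert(bytes != nullptr || size == 0);
    if (size < 3) {
        return false;
    }
    switch (bytes[0]) {
    case 0xEF:
        return bytes[1] == 0xBB && bytes[2] == 0xBF;
    default:
        return false;
    }
}
```

In the code from the question, the equivalent change is declaring the buffer as `unsigned char BOM[3]`.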

Can someone briefly explain what was going on in the char -> int promotion that produced 0xffffffef instead of 0x000000ef?

Answer

"Can someone briefly explain what was going on in the char -> int promotion that produced 0xffffffef instead of 0x000000ef?"

Contrary to the four answers so far, it didn't.

Rather, you had a negative char value, which as a switch condition was promoted to the same negative int value as required by

C++98 §6.4.2/2
Integral promotions are performed.
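To illustrate that the promotion keeps the value, here is a small sketch. The function names `promote` and `as_32_bits` are made up for this example. The second function only shows which 32-bit pattern corresponds to the promoted value; it uses a conversion to an unsigned type, which is defined as arithmetic modulo 2^32.

```cpp
#include <cstdint>

// Integral promotion keeps the value: a signed char holding -17 becomes
// the int -17, not 239 (0xEF).
int promote(signed char c)
{
    return c; // implicit integral promotion, value unchanged
}

// Hypothetical helper: the 32-bit pattern that corresponds to a value.
// Conversion to an unsigned type is modulo 2^32, so -17 maps to 0xFFFFFFEF.
std::uint32_t as_32_bits(int value)
{
    return static_cast<std::uint32_t>(value);
}
```

With these definitions, `promote(-17)` is `-17`, and `as_32_bits(-17)` is `0xFFFFFFEF`, which is the pattern that matched in the question.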

Then with your 32-bit C++ compiler 0xffffffef was interpreted as an unsigned int literal, because it’s too large for a 32-bit int, by

C++98 §2.13.1/2
If it is octal or hexadecimal and has no suffix, it has the first of these types in which it can be represented: int, unsigned int, long int, unsigned long int.
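The literal rule can be checked at compile time. This sketch assumes a 32-bit `int`, as in the answer, and uses `static_assert` and `decltype`, which come from C++11 rather than the C++98 text quoted here.

```cpp
#include <climits>
#include <type_traits>

// These checks assume a 32-bit int; on other platforms the second literal
// could have a different type.
static_assert(sizeof(int) * CHAR_BIT == 32, "sketch assumes 32-bit int");

// 0xef fits in int, so the literal has type int.
static_assert(std::is_same<decltype(0xef), int>::value,
              "0xef is an int");

// 0xffffffef does not fit in a 32-bit int, so the next candidate type in
// the list, unsigned int, is used.
static_assert(std::is_same<decltype(0xffffffef), unsigned int>::value,
              "0xffffffef is an unsigned int");
```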

Now, for the case label,

C++98 §6.4.2/2
The integral constant-expression (5.19) is implicitly converted to the promoted type of the switch condition.

In your case, with signed destination type, the result of the conversion is formally implementation-defined, by

C++98 §4.7/3
If the destination type is signed, the value is unchanged if it can be represented in the destination type (and bit-field width); otherwise, the value is implementation-defined.

But in practice nearly all compilers use two's complement representation with no trapping, and so the implementation-defined conversion is in your case that the bit pattern 0xffffffef is interpreted as the two's complement representation of a negative value. You can calculate which value by 0xffffffef - 2^32 = -17, because we’re talking 32-bit representation here. Or, since this is just an 8-bit value that’s been sign extended to 32 bits, you can alternatively calculate it as 0xef - 2^8 = -17, where 0xef is the character code point.
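The two calculations can be written out as a short sketch. The function names are hypothetical, and the code assumes the two's complement interpretation described above: when the top bit of the pattern is set, the signed value is the unsigned reading minus 2^N.

```cpp
#include <cstdint>

// Value of a 32-bit two's complement bit pattern: subtract 2^32 from the
// unsigned reading when the top bit is set.
std::int64_t value_of_32_bit_pattern(std::uint32_t bits)
{
    const std::int64_t two_to_32 = std::int64_t(1) << 32;
    const std::int64_t unsigned_value = bits;
    return (bits & 0x80000000u) ? unsigned_value - two_to_32 : unsigned_value;
}

// Same idea for an 8-bit pattern: subtract 2^8 when the top bit is set.
int value_of_8_bit_pattern(std::uint8_t bits)
{
    return (bits & 0x80u) ? bits - 256 : bits;
}
```

Both `value_of_32_bit_pattern(0xFFFFFFEF)` and `value_of_8_bit_pattern(0xEF)` give `-17`, which is the value the `char` held all along.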

Cheers & hth.,
