Bad value assignment after type casting

Question

I am using a native unsigned long variable as a buffer to hold two unsigned short values. From what I know of C++, this should be a valid method. I have used the same approach many times to store two unsigned char values inside one unsigned short without any problem. Unfortunately, on a different architecture it reacts strangely: each value only seems to be updated after the following assignment. The (Overflow) case is there simply to demonstrate this. Can anyone shed some light on why it reacts that way?

unsigned long dwTest = 0xFFEEDDCC;

printf("sizeof(unsigned short) = %d\n", sizeof(unsigned short));
printf("dwTest = %08X\n", dwTest);

//Address + values
printf("Addresses + Values: %08X <- %08X, %08X <- %08X\n", (DWORD)(&((unsigned short*)&dwTest)[0]), (((unsigned short*)&dwTest)[0]), (DWORD)(&((unsigned short*)&dwTest)[1]), (((unsigned short*)&dwTest)[1]) );

((unsigned short*)&dwTest)[0] = (WORD)0xAAAA;
printf("dwTest = %08X\n", dwTest);

((unsigned short*)&dwTest)[1] = (WORD)0xBBBB;
printf("dwTest = %08X\n", dwTest);

//(Overflow)
((unsigned short*)&dwTest)[2] = (WORD)0x9999;

printf("dwTest = %08X\n", dwTest);

Visual C++ 2010 output (OK):

sizeof(unsigned short) = 2
dwTest = FFEEDDCC
Addresses + Values: 0031F728 <- 0000DDCC, 0031F72A <- 0000FFEE

dwTest = FFEEAAAA

dwTest = BBBBAAAA

dwTest = BBBBAAAA

ARM9 GCC Crosstool output (Doesn't work):

sizeof(unsigned short) = 2
dwTest = FFEEDDCC
Addresses + Values: 7FAFECD8 <- 0000DDCC, 7FAFECDA <- 0000FFEE

dwTest = FFEEDDCC

dwTest = FFEEAAAA

dwTest = BBBBAAAA

Recommended answer

What you are trying to do is called type-punning. There are two traditional ways to do it.

One way to do it is via pointers (which is what you have done). Unfortunately, this conflicts with the optimizer. Because of the halting problem, the optimizer cannot know in the general case whether two pointers alias each other. This means the compiler has to reload any value that may have been modified through a pointer, resulting in lots of potentially unnecessary reloads.
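
To make the reload cost concrete, here is a minimal sketch (the function name store_both is made up for illustration and is not from the question). Both pointers have the same type, so they may legally point at the same object, and the compiler has to read *a again after the store through b.

```cpp
// Hypothetical helper, not from the question: two pointers of the same
// type may legally refer to the same object.
int store_both(int* a, int* b) {
    *a = 1;
    *b = 2;      // may or may not overwrite *a
    return *a;   // so *a has to be read again here, not assumed to be 1
}
```

Calling store_both(&x, &x) returns 2, while store_both(&x, &y) with two distinct ints returns 1. The compiler cannot tell the two cases apart from inside the function.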

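So the strict-aliasing rule was introduced. It basically says that two pointers can only alias each other when they point to the same type (signedness and const/volatile qualifiers aside). As a special case, a char * can alias any other pointer (but not the other way around). This breaks type-punning via pointers and lets the compiler generate more efficient code. That is most likely what you are seeing with GCC on ARM: the compiler assumes a store through an unsigned short * cannot change dwTest, so it is free to print a value it loaded earlier. When gcc detects type-punning and has warnings enabled, it will warn you: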

warning: dereferencing type-punned pointer will break strict-aliasing rules
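
The character-type exception mentioned above can be sketched as follows. This is only an illustration: the helper name byte_at is an assumption, and the byte you get for a given index depends on the platform's byte order.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: reads byte `index` of a 32-bit value through an
// unsigned char pointer, which is allowed to alias any object.
unsigned char byte_at(const std::uint32_t& value, std::size_t index) {
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(&value);
    return bytes[index];  // caller must keep index < sizeof(value)
}
```

For the value 0xFFEEDDCC, byte_at(value, 0) is 0xCC on a little-endian machine and 0xFF on a big-endian one.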

Another way to do type-punning is via the union:

union {
    int i;
    short s[2];
} u;
u.i = 0xDEADBEEF;
u.s[0] = 0xBABE;
....

This opens up a whole new can of worms. In the best case, this is implementation-dependent. I don't have access to the C89 standard, but C99 originally stated that the value of a union member other than the last one stored into is unspecified. This was changed in a TC to state that only the values of bytes that don't correspond to the last stored-into member are unspecified, while the bytes that do correspond to it are reinterpreted as the new type (which is obviously implementation-dependent).

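For C++, I can't find any language in the standard that permits the union hack. C++ does have reinterpret_cast<>, which is the explicit way to spell type-punning in C++ (use the reference variant of reinterpret_cast<>). Be aware, though, that the cast does not exempt the access from the aliasing rules; copying the bytes with memcpy is the approach that stays well defined.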
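
As a hedged aside, an alternative that avoids casting between unrelated pointer types is to copy the bytes with std::memcpy. The sketch below assumes fixed-width types from <cstdint> instead of the question's unsigned long and unsigned short, and the helper names read_half and write_half are invented for illustration. Which half counts as index 0 still depends on the platform's byte order.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers: copy bytes instead of dereferencing a cast pointer.
// `index` must be 0 or 1; its meaning depends on the platform's byte order.
std::uint16_t read_half(std::uint32_t value, unsigned index) {
    std::uint16_t half;
    const unsigned char* bytes =
        reinterpret_cast<const unsigned char*>(&value);
    std::memcpy(&half, bytes + index * sizeof half, sizeof half);
    return half;
}

void write_half(std::uint32_t& value, unsigned index, std::uint16_t half) {
    unsigned char* bytes = reinterpret_cast<unsigned char*>(&value);
    std::memcpy(bytes + index * sizeof half, &half, sizeof half);
}
```

Compilers typically turn these small fixed-size copies into plain loads and stores, so this usually costs nothing compared with the pointer cast.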

Either way, you probably shouldn't be using type-punning at all, since it is implementation-dependent; build up your values manually with bit-shifting instead.
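
The bit-shifting approach might look like the sketch below. It assumes 16-bit and 32-bit fixed-width types from <cstdint>, and the function names are made up for illustration. Because it works on values, not on memory, the result is the same on every architecture.

```cpp
#include <cstdint>

// Hypothetical helpers: combine and split 16-bit halves with shifts and masks.
constexpr std::uint32_t pack_halves(std::uint16_t high, std::uint16_t low) {
    return (static_cast<std::uint32_t>(high) << 16) | low;
}

constexpr std::uint16_t low_half(std::uint32_t value) {
    return static_cast<std::uint16_t>(value & 0xFFFFu);
}

constexpr std::uint16_t high_half(std::uint32_t value) {
    return static_cast<std::uint16_t>(value >> 16);
}
```

For example, pack_halves(0xBBBB, 0xAAAA) yields 0xBBBBAAAA regardless of byte order, and low_half and high_half recover 0xAAAA and 0xBBBB from it.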
