Undefined high-order of uint64_t while shifting and masking 32-bit values

Problem description

I have some undefined behaviour in a seemingly innocuous function which is parsing a double value from a buffer. I read the double in two halves, because I am reasonably certain the language standard says that shifting char values is only valid in a 32-bit context.

inline double ReadLittleEndianDouble( const unsigned char *buf )
{
    uint64_t lo = (buf[3] << 24) | (buf[2] << 16) | (buf[1] << 8) | buf[0];
    uint64_t hi = (buf[7] << 24) | (buf[6] << 16) | (buf[5] << 8) | buf[4];
    uint64_t val = (hi << 32) | lo;
    return *(double*)&val;
}

Since I am storing 32-bit values into 64-bit variables lo and hi, I reasonably expect that the high-order 32-bits of these variables will always be 0x00000000. But sometimes they contain 0xffffffff or other non-zero rubbish.

The fix is to mask it like this:

uint64_t val = ((hi & 0xffffffffULL) << 32) | (lo & 0xffffffffULL);

Alternatively, it seems to work if I mask during the assignment instead:

uint64_t lo = ((buf[3] << 24) | (buf[2] << 16) | (buf[1] << 8) | buf[0]) & 0xffffffff;
uint64_t hi = ((buf[7] << 24) | (buf[6] << 16) | (buf[5] << 8) | buf[4]) & 0xffffffff;

I would like to know why this is necessary. All I can think of to explain this is that my compiler is doing all the shifting and combining for lo and hi directly on 64-bit registers, and I might expect undefined behaviour in the high-order 32-bits if this is the case.

Can someone please confirm my suspicions or otherwise explain what is happening here, and comment on which (if any) of my two solutions is preferable?

Recommended answer

If you try to shift a char or unsigned char you're leaving yourself at the mercy of the standard integer promotions. You're better off casting the values yourself, before you try to shift them. You don't have to separate the lower and upper halves if you do so.

inline double ReadLittleEndianDouble( const unsigned char *buf )
{
    uint64_t val = ((uint64_t)buf[7] << 56) | ((uint64_t)buf[6] << 48) | ((uint64_t)buf[5] << 40) | ((uint64_t)buf[4] << 32) |
                   ((uint64_t)buf[3] << 24) | ((uint64_t)buf[2] << 16) | ((uint64_t)buf[1] << 8) | (uint64_t)buf[0];
    return *(double*)&val;
}
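The stray 0xffffffff in the question comes from how the value is widened. Each buf[i] is promoted to a signed int before the shift, so when buf[3] is 0x80 or larger, buf[3] << 24 does not fit in a 32-bit int (undefined in C, and in practice a negative value). Assigning that negative int to a uint64_t then sign-extends it. The following is a minimal sketch of that widening step, assuming a 32-bit int; the helper names are made up for illustration and are not part of the original code.

```cpp
#include <cstdint>

// Converting a negative 32-bit value to uint64_t is well defined: the value
// is reduced modulo 2^64, which fills the upper 32 bits with ones.
inline uint64_t WidenFromInt(int32_t v)
{
    return static_cast<uint64_t>(v);
}

// Going through uint32_t first zero-extends instead, which is what the
// masking with 0xffffffff in the question achieves after the fact.
inline uint64_t WidenFromUnsigned(int32_t v)
{
    return static_cast<uint64_t>(static_cast<uint32_t>(v));
}

// Hypothetical helper: assemble a 32-bit word with unsigned arithmetic only,
// so no intermediate result is ever a signed int.
inline uint32_t ReadLittleEndianU32(const unsigned char *buf)
{
    return (static_cast<uint32_t>(buf[3]) << 24) |
           (static_cast<uint32_t>(buf[2]) << 16) |
           (static_cast<uint32_t>(buf[1]) << 8)  |
            static_cast<uint32_t>(buf[0]);
}
```

Both masking variants in the question hide the symptom after the sign extension has happened; casting each byte to an unsigned type before shifting avoids the problem at its source.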

Assembling the value byte by byte is necessary only if the CPU is big-endian or if the buffer might not be properly aligned for the CPU architecture; otherwise you can simplify this greatly:

    return *(double*)buf;
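One caveat on the pointer casts: both *(double*)&val and *(double*)buf read memory through a pointer to an unrelated type, which the strict aliasing rules do not permit, and optimizing compilers may act on that. A common alternative is to copy the bits with memcpy. The sketch below assumes a 64-bit IEEE-754 double with the same byte order as the platform's 64-bit integers; the function name is hypothetical.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical variant of the answer's function: assemble the 64-bit pattern
// from little-endian bytes, then copy the bits into a double with memcpy.
inline double ReadLittleEndianDoubleMemcpy(const unsigned char *buf)
{
    static_assert(sizeof(double) == sizeof(uint64_t),
                  "this sketch assumes a 64-bit double");

    uint64_t val = 0;
    for (int i = 7; i >= 0; --i) {
        // val is uint64_t, so the shift and the OR are done in 64 bits.
        val = (val << 8) | buf[i];
    }

    double result;
    std::memcpy(&result, &val, sizeof result);  // bit copy, no aliasing issue
    return result;
}
```

Compilers typically turn the memcpy into a single register move, so this usually costs nothing over the cast while working for unaligned buffers on either endianness.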
