Is masking before unsigned left shift in C/C++ too paranoid?
Question
This question is motivated by me implementing cryptographic algorithms (e.g. SHA-1) in C/C++, writing portable platform-agnostic code, and thoroughly avoiding undefined behavior.
Suppose that a standardized crypto algorithm asks you to implement this:
b = (a << 31) & 0xFFFFFFFF
where a and b are unsigned 32-bit integers. Notice that in the result, we discard any bits above the least significant 32 bits.
As a first naive approximation, we might assume that int is 32 bits wide on most platforms, so we would write:
unsigned int a = (...);
unsigned int b = a << 31;
We know this code won't work everywhere because int is 16 bits wide on some systems, 64 bits on others, and possibly even 36 bits. But using stdint.h, we can improve this code with the uint32_t type:
uint32_t a = (...);
uint32_t b = a << 31;
So we are done, right? That's what I thought for years. ... Not quite. Suppose that on a certain platform, we have:
// stdint.h
typedef unsigned short uint32_t;
The rule for performing arithmetic operations in C/C++ (integer promotion) is that if the type (such as short) is narrower than int, then it gets widened to int if all values can fit, or unsigned int otherwise.
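The promotion rule can be observed directly on whatever platform the code is compiled for. The following is a minimal sketch, assuming a C11 compiler (for _Generic); the function names are made up for illustration.

```c
#include <limits.h>

/* Hypothetical helper: 1 if every unsigned short value fits in int. */
int ushort_fits_in_int(void)
{
    return USHRT_MAX <= INT_MAX;
}

/* Hypothetical helper: reports the type an unsigned short operand is
   promoted to. Unary plus applies the integer promotions and nothing else.
   Returns 1 for int, 0 for unsigned int. */
int ushort_promotes_to_signed_int(void)
{
    unsigned short x = 1;
    return _Generic(+x, int: 1, unsigned int: 0, default: -1);
}
```

On a platform where unsigned short is narrower than int, both helpers should return 1; where the two types have the same width, both should return 0.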
Let's say that the compiler defines short as 32 bits (signed) and int as 48 bits (signed). Then these lines of code:
uint32_t a = (...);
uint32_t b = a << 31;
would effectively mean:
unsigned short a = (...);
unsigned short b = (unsigned short)((int)a << 31);
Note that a is promoted to int because all of ushort (i.e. uint32) fits into int (i.e. int48).
But now we have a problem: shifting non-zero bits left into the sign bit of a signed integer type is undefined behavior. This problem happened because our uint32 was promoted to int48 instead of being promoted to uint48 (where left-shifting would be okay).
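The 48-bit int platform is hypothetical, but the same hazard can be reproduced on common platforms one size down, with uint16_t and a 32-bit int. The sketch below is an analogy, not the exact case from the question, and the function name is made up for illustration.

```c
#include <stdint.h>

/* Hazard, left as a comment on purpose. With a 32-bit int:
       uint16_t a = 0x8000;
       uint32_t b = a << 16;
   a is promoted to signed int, and 0x8000 << 16 is not representable
   in a 32-bit int, so the shift is undefined behavior in C. */

/* Converting to the wider unsigned type before shifting keeps the
   left operand from being a too-narrow signed int. */
uint32_t shift_to_high_half(uint16_t a)
{
    return (uint32_t)a << 16;
}
```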
Here are my questions:
- Is my reasoning correct, and is this a legitimate problem in theory?
- Is this problem safe to ignore because on every platform the next integer type is double the width?
- Is it a good idea to defend against this pathological situation by pre-masking the input like this: b = (a & 1) << 31; ? (This will necessarily be correct on every platform. But this could make a speed-critical crypto algorithm slower than necessary.)
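For concreteness, the pre-masked shift from the last question might be wrapped as follows. This is only a sketch; the function name is an assumption, not something from the question.

```c
#include <stdint.h>

/* Hypothetical helper: keeps only bit 0 of a, then moves it to bit 31. */
uint32_t shl31_premasked(uint32_t a)
{
    /* (a & 1) is 0 or 1, so the shifted value is at most 2^31. Any signed
       int that uint32_t could be promoted to must hold every uint32_t
       value, so 2^31 is representable in it. */
    return (uint32_t)((a & 1) << 31);
}
```

Every bit of a other than bit 0 would be shifted out of the low 32 bits anyway, so the mask does not change the result.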
Clarifications/edits:
- I'll accept answers for C or C++ or both. I want to know the answer for at least one of the languages.
- The pre-masking logic may hurt bit rotation. For example, GCC will compile b = (a << 31) | (a >> 1); to a 32-bit bit-rotation instruction in assembly language. But if we pre-mask the left shift, it is possible that the new logic is not translated into bit rotation, which means now 4 operations are performed instead of 1.
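The two forms of the rotation being compared could look like the sketch below (function names are illustrative). Whether a particular compiler still emits a single rotate instruction for the pre-masked form is not something this code can show; that would need checking the generated assembly.

```c
#include <stdint.h>

/* Rotate right by one, written the usual way. This form relies on
   uint32_t not being promoted to a signed int. */
uint32_t rotr1_plain(uint32_t a)
{
    return (a << 31) | (a >> 1);
}

/* Same rotation with the left shift pre-masked, so the left shift
   cannot overflow even if a is promoted to a wider signed int. */
uint32_t rotr1_premasked(uint32_t a)
{
    return (uint32_t)(((a & 1) << 31) | (a >> 1));
}
```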
Recommended answer
Taking a clue from this question about possible UB in uint32 * uint32 arithmetic, the following simple approach should work in C and C++:
uint32_t a = (...);
uint32_t b = (uint32_t)((a + 0u) << 31);
The integer constant 0u has type unsigned int. By the usual arithmetic conversions, the addition a + 0u is therefore performed in uint32_t or unsigned int, whichever is wider. Because that type has the rank of int or higher, no further promotion occurs, and the shift is applied with a left operand of type uint32_t or unsigned int.
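As a rough illustration of this approach, the sketch below wraps the shift in a function and applies the same idea to the rotation from the question. The function names are made up, and the type check assumes a C11 compiler (for _Generic) and that uint32_t is one of the standard unsigned types.

```c
#include <stdint.h>

/* Hypothetical check: 1 if a + 0u has an unsigned type of rank
   int or higher, 0 otherwise. */
int sum_is_unsigned(void)
{
    uint32_t a = 1;
    return _Generic(a + 0u,
                    unsigned int: 1,
                    unsigned long: 1,
                    unsigned long long: 1,
                    default: 0);
}

/* The shift from the answer: a + 0u is unsigned, so the shift is
   performed on an unsigned operand. */
uint32_t shl31_portable(uint32_t a)
{
    return (uint32_t)((a + 0u) << 31);
}

/* Rotate right by one using the same idea for the left shift. */
uint32_t rotr1_portable(uint32_t a)
{
    return (uint32_t)(((a + 0u) << 31) | (a >> 1));
}
```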
The final cast back to uint32_t just suppresses potential warnings about a narrowing conversion (say, if int is 64 bits).
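To see what the conversion back to uint32_t does when the shift happens in a wider type, a 64-bit unsigned int can be imitated with uint64_t. This is a simulation under that assumption, not a real platform, and the function name is illustrative.

```c
#include <stdint.h>

/* uint64_t stands in for a hypothetical 64-bit unsigned int. */
uint32_t shl31_via_wide(uint32_t a)
{
    uint64_t wide = (uint64_t)a << 31;  /* bits above bit 31 are still present */
    return (uint32_t)wide;              /* conversion reduces modulo 2^32 */
}
```

The conversion to an unsigned type is well defined and discards the high bits, which is the masking with 0xFFFFFFFF that the algorithm asks for.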
A decent C compiler should be able to see that adding zero is a no-op, which is less onerous than seeing that a pre-mask has no effect after an unsigned shift.