How do I negate a 64-bit integer stored in a 32-bit register pair?
Question
I've stored a 64-bit integer in the EDX:EAX register pair. How can I correctly negate the number?
For example: 123456789123 → -123456789123.
Recommended answer
Ask a compiler for ideas: compile int64_t neg(int64_t a) { return -a; } in 32-bit mode. Depending on how you ask, the starting value will be in memory, in registers of the compiler's choice, or already in EDX:EAX. See all three ways on the Godbolt compiler explorer, with asm output from gcc, clang, and MSVC (aka CL).
There are of course lots of ways to accomplish this, but any possible sequence will need some kind of carry from low to high at some point, so there's no efficient way to avoid SBB or ADC.
If the value starts in memory, or you want to keep the original value in registers, xor-zero the destination and use SUB/SBB. The SysV x86-32 ABI passes args on the stack and returns 64-bit integers in EDX:EAX. This is what clang 3.9.1 -m32 -O3 does for neg_value_from_mem:
; optimal for data coming from memory: just subtract from zero
xor eax, eax
xor edx, edx
sub eax, dword ptr [esp + 4]
sbb edx, dword ptr [esp + 8]
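To make the borrow from the low half to the high half explicit, here is a minimal C sketch that models the SUB/SBB sequence above on two 32-bit halves. The type pair32 and the helpers split64, join64, negate_sub_sbb and check_sub_sbb are hypothetical names introduced only for illustration; they are not part of the original answer or of any compiler output.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type: a 64-bit value as two 32-bit halves,
   mirroring EDX (hi) : EAX (lo). */
typedef struct {
    uint32_t lo;
    uint32_t hi;
} pair32;

/* Split a 64-bit value into its low and high 32-bit halves. */
pair32 split64(uint64_t v)
{
    pair32 p;
    p.lo = (uint32_t)v;
    p.hi = (uint32_t)(v >> 32);
    return p;
}

/* Reassemble the two halves into one 64-bit value. */
uint64_t join64(pair32 p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}

/* Models: xor eax,eax / xor edx,edx / sub eax,lo / sbb edx,hi */
pair32 negate_sub_sbb(pair32 in)
{
    pair32 out;
    uint32_t borrow = (in.lo != 0u);  /* SUB from zero sets CF iff lo is non-zero */
    out.lo = 0u - in.lo;              /* SUB: wraps modulo 2^32 */
    out.hi = 0u - in.hi - borrow;     /* SBB: subtract the high half and the borrow */
    return out;
}

/* Compare the model against the compiler's own 64-bit arithmetic. */
void check_sub_sbb(uint64_t v)
{
    assert(join64(negate_sub_sbb(split64(v))) == (uint64_t)0 - v);
}
```

Calling check_sub_sbb(123456789123ULL) exercises the example from the question. The sketch uses unsigned arithmetic throughout so that wraparound is well defined in C; the bit pattern is the same as for two's-complement signed negation.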
If you have the values in registers and don't need the result in-place, you can use NEG to set a register to 0 - itself, setting CF iff the input is non-zero, i.e. the same way SUB would. Note that xor-zeroing is cheap and not part of the latency critical path, so this is definitely better than gcc's 3-instruction sequence (below).
;; partially in-place: input in ecx:eax
xor edx, edx
neg eax ; eax = 0-eax, setting flags appropriately
sbb edx, ecx ;; result in edx:eax
Clang does this even for the in-place case, even though that costs an extra mov ecx,edx. That's optimal for latency on modern CPUs that have zero-latency mov reg,reg (Intel IvB+ and AMD Zen), but not for number of fused-domain uops (frontend throughput) or code-size.
gcc's sequence is interesting and not totally obvious. It saves an instruction vs. clang for the in-place case, but it's worse otherwise.
; gcc's in-place sequence, only good for in-place use
neg eax
adc edx, 0
neg edx
; disadvantage: higher latency for the upper half than subtract-from-zero
; advantage: result in edx:eax with no extra registers used
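Why NEG / ADC / NEG gives the right answer may not be obvious: adding the carry to the high half before negating it is the same as subtracting the carry after negating, since -(hi + CF) = -hi - CF modulo 2^32. The following C sketch models the three instructions one by one. The names reg_pair, to_pair, from_pair, negate_neg_adc_neg and check_neg_adc_neg are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the EDX:EAX register pair. */
typedef struct {
    uint32_t eax;  /* low half */
    uint32_t edx;  /* high half */
} reg_pair;

/* Load a 64-bit value into the modelled register pair. */
reg_pair to_pair(uint64_t v)
{
    reg_pair r;
    r.eax = (uint32_t)v;
    r.edx = (uint32_t)(v >> 32);
    return r;
}

/* Read the modelled register pair back as one 64-bit value. */
uint64_t from_pair(reg_pair r)
{
    return ((uint64_t)r.edx << 32) | r.eax;
}

/* Models gcc's in-place sequence: neg eax / adc edx, 0 / neg edx */
reg_pair negate_neg_adc_neg(reg_pair r)
{
    uint32_t cf = (r.eax != 0u);  /* NEG sets CF iff its operand was non-zero */
    r.eax = 0u - r.eax;           /* neg eax */
    r.edx = r.edx + cf;           /* adc edx, 0 (wraps modulo 2^32) */
    r.edx = 0u - r.edx;           /* neg edx */
    return r;
}

/* Compare the model against the compiler's own 64-bit arithmetic. */
void check_neg_adc_neg(uint64_t v)
{
    assert(from_pair(negate_neg_adc_neg(to_pair(v))) == (uint64_t)0 - v);
}
```

The model also shows the latency disadvantage mentioned in the comments: the high half goes through two dependent operations after the carry is known, instead of one SBB.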
Unfortunately, gcc and MSVC both always use this, even when xor-zero + sub/sbb would be better.
For a more complete picture of what compilers do, have a look at their output for these functions (on Godbolt):
#include <stdint.h>

int64_t neg_value_from_mem(int64_t a) {
    return -a;
}

int64_t neg_value_in_regs(int64_t a) {
    // The OR makes the compiler load+OR first
    // but it can choose regs to set up for the negate
    int64_t reg = a | 0x1111111111LL;
    // clang chooses mov reg,mem / or reg,imm8 when possible,
    // otherwise mov reg,imm32 / or reg,mem. Nice :)
    return -reg;
}

int64_t foo();

int64_t neg_value_in_place(int64_t a) {
    // foo's return value will be in edx:eax
    return -foo();
}