Which 2's complement integer operations can be used without zeroing high bits in the inputs, if only the low part of the result is wanted?

Question

In assembly programming, it's fairly common to want to compute something from the low bits of a register that isn't guaranteed to have the other bits zeroed. In higher level languages like C, you'd simply cast your inputs to the small size and let the compiler decide whether it needs to zero the upper bits of each input separately, or whether it can chop off the upper bits of the result after the fact.

This is especially common for x86-64 (aka AMD64), for various reasons¹, some of which are present in other ISAs.

I'll use 64-bit x86 for examples, but the intent is to ask about and discuss 2's complement and unsigned binary arithmetic in general, since all modern CPUs use it (see http://stackoverflow.com/questions/2931630/how-are-negative-numbers-represented-in-32-bit-signed-integer). (Note that C and C++ don't guarantee two's complement⁴, and that signed overflow is undefined behaviour.)

As an example, consider a simple function that can compile to a single LEA instruction². (In the x86-64 SysV (Linux) ABI³, the first two function args are in rdi and rsi, with the return value in rax. int is a 32-bit type.)

; int intfunc(int a, int b) { return a + b*4 + 3; }
intfunc:
    lea  eax,  [edi + esi*4 + 3]  ; the obvious choice, but gcc can do better
    ret

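gcc knows that addition, even of negative signed integers, carries only from right to left, so the upper bits of the inputs can't affect what goes into eax. Thus, it saves an instruction byte and uses lea eax, [rdi + rsi*4 + 3].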
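
The carry argument can be checked with a minimal C sketch. It assumes fixed-width `uint64_t` and `uint32_t` are available, and the names `intfunc_wide`, `intfunc_narrow`, `low32_agree`, and `check_low32` are hypothetical, not from the original code. It evaluates the expression at full 64-bit width with garbage in the upper halves of the inputs, then at 32-bit width on the truncated inputs, and compares the low 32 bits. Unsigned types are used so that wraparound is well defined.

```c
#include <assert.h>
#include <stdint.h>

/* 64-bit computation: the upper halves of a and b may hold garbage. */
uint64_t intfunc_wide(uint64_t a, uint64_t b) {
    return a + b * 4 + 3;
}

/* 32-bit computation on inputs that were truncated first. */
uint32_t intfunc_narrow(uint32_t a, uint32_t b) {
    return a + b * 4 + 3;
}

/* Returns 1 if truncating after the wide computation matches truncating before. */
int low32_agree(uint64_t a, uint64_t b) {
    return (uint32_t)intfunc_wide(a, b) == intfunc_narrow((uint32_t)a, (uint32_t)b);
}

/* Usage example: low halves hold -5 and 2, upper halves hold arbitrary garbage. */
void check_low32(void) {
    uint64_t a = 0xDEADBEEFFFFFFFFBULL;
    uint64_t b = 0xCAFEBABE00000002ULL;
    assert(low32_agree(a, b));
    assert((uint32_t)intfunc_wide(a, b) == 6u);   /* -5 + 2*4 + 3 */
}
```

Carries out of bit 31 only travel upward, which is why the garbage never reaches the low half of the result.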

And why does it work?

¹ Why this comes up frequently for x86-64: x86-64 has variable-length instructions, where an extra prefix byte changes the operand size (from 32 to 64 or 16), so saving a byte is often possible in instructions that otherwise execute at the same speed. It also has false dependencies (AMD/P4/Silvermont) when writing the low 8 or 16 bits of a register (or a stall when later reading the full register, on Intel pre-IvB). For historical reasons, only writes to 32-bit sub-registers zero the rest of the 64-bit register (see http://stackoverflow.com/questions/11177137/why-do-most-x64-instructions-zero-the-upper-part-of-a-32-bit-register). Almost all arithmetic and logic instructions can be used on the low 8, 16, or 32 bits, as well as the full 64 bits, of general-purpose registers. Integer vector instructions are also rather non-orthogonal, with some operations not available for some element sizes.

Furthermore, unlike x86-32, the ABI passes function args in registers, and upper bits aren't required to be zero for narrow types.

² LEA: Like other instructions, the default operand size of LEA is 32 bits, but the default address size is 64 bits. An operand-size prefix byte (0x66 or REX.W) can make the output operand size 16 or 64 bits. An address-size prefix byte (0x67) can reduce the address size to 32 bits (in 64-bit mode) or 16 bits (in 32-bit mode). So in 64-bit mode, lea eax, [edx+esi] takes one byte more than lea eax, [rdx+rsi].

It is possible to do lea rax, [edx+esi], but the address is still only computed with 32bits (a carry doesn't set bit 32 of rax). You get identical results with lea eax, [rdx+rsi], which is two bytes shorter. Thus, the address-size prefix is never useful with LEA, as the comments in disassembly output from Agner Fog's excellent objconv disassembler warn.

³ x86 ABI: The caller doesn't have to zero (or sign-extend) the upper part of 64-bit registers used to pass or return smaller types by value. A caller that wanted to use the return value as an array index would have to sign-extend it (with movsxd rax, eax, or the special-case-for-eax instruction cdqe; not to be confused with cdq, which sign-extends eax into edx:eax, e.g. to set up for idiv).

This means a function returning unsigned int can compute its return value in a 64bit temporary in rax, and not require a mov eax, eax to zero the upper bits of rax. This design decision works well in most cases: often the caller doesn't need any extra instructions to ignore the undefined bits in the upper half of rax.
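
To make the caller's two options concrete, here is a hedged C sketch of widening a 32-bit value whose containing 64-bit register has undefined upper bits. The helper names `sign_extend_low32`, `zero_extend_low32`, and `check_extension` are hypothetical. On x86-64 a compiler would typically emit movsxd (or cdqe) for the first and mov eax, eax for the second, but that is an expectation about code generation, not something this sketch verifies.

```c
#include <assert.h>
#include <stdint.h>

/* Interpret the low 32 bits as a signed int and widen to 64 bits.
   Written with plain arithmetic so it does not rely on
   implementation-defined narrowing conversions. */
int64_t sign_extend_low32(uint64_t reg) {
    uint32_t low = (uint32_t)reg;        /* discard the undefined upper half */
    int64_t wide = (int64_t)low;         /* 0 .. 2^32 - 1 */
    return (low & 0x80000000u) ? wide - 4294967296LL : wide;
}

/* Interpret the low 32 bits as unsigned and widen to 64 bits. */
uint64_t zero_extend_low32(uint64_t reg) {
    return (uint64_t)(uint32_t)reg;
}

/* Usage example: the low half is -2 as an int, the upper half is garbage. */
void check_extension(void) {
    uint64_t garbage = 0xDEADBEEFFFFFFFFEULL;
    assert(sign_extend_low32(garbage) == -2);
    assert(zero_extend_low32(garbage) == 0xFFFFFFFEULL);
}
```

Either helper is only needed at the point where the full 64-bit value matters, such as address computation; operations that consume only the low 32 bits can skip it.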

⁴ C and C++ specifically do not require two's complement binary signed integers (except for C++ std::atomic types). One's complement and sign/magnitude are also allowed, so for fully portable C, these tricks are only useful with unsigned types. For signed operations, a set sign bit in sign/magnitude representation means the other bits are subtracted rather than added, for example. I haven't worked through the logic for one's complement.

However, bit-hacks that only work with two's complement are widespread, because in practice nobody cares about anything else. Many things that work with two's complement should also work with one's complement, since the sign bit still doesn't change the interpretation of the other bits: for a sign bit at position N, it just has a value of -(2^N - 1) (instead of -2^N). Sign/magnitude representation does not have this property: the place value of every bit is positive or negative depending on the sign bit.
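
The place-value argument can be illustrated with a small C sketch that decodes the same 8-bit pattern under each representation, with the sign bit at position 7. The function names are hypothetical and the code only models the three encodings arithmetically; it does not depend on how the host machine represents signed integers.

```c
#include <assert.h>

/* Two's complement: the sign bit has weight -2^7, other bits keep their weights. */
int value_twos_complement(unsigned bits) {
    int low = (int)(bits & 0x7Fu);
    return (bits & 0x80u) ? low - 128 : low;
}

/* One's complement: the sign bit has weight -(2^7 - 1), other bits keep their weights. */
int value_ones_complement(unsigned bits) {
    int low = (int)(bits & 0x7Fu);
    return (bits & 0x80u) ? low - 127 : low;
}

/* Sign/magnitude: a set sign bit negates the weight of every other bit. */
int value_sign_magnitude(unsigned bits) {
    int low = (int)(bits & 0x7Fu);
    return (bits & 0x80u) ? -low : low;
}

/* Usage example: decode the pattern 1111 1110 three ways. */
void check_representations(void) {
    assert(value_twos_complement(0xFEu) == -2);
    assert(value_ones_complement(0xFEu) == -1);
    assert(value_sign_magnitude(0xFEu) == -126);
}
```

In the first two decoders the low seven bits contribute the same amount whether or not the sign bit is set; only the sign/magnitude decoder changes their contribution.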

Also note that C compilers are allowed to assume that signed overflow never happens, because it's undefined behaviour. So, for example, compilers can and do assume (x+1) < x is always false. This makes detecting signed overflow rather inconvenient in C. Note the difference between unsigned wraparound (carry) and signed overflow.
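
One common way around that inconvenience is to test the operands before adding, so the overflowing signed addition is never executed. The sketch below is one possible approach, not the only one, and the names `signed_add_overflows`, `unsigned_add_wraps`, and `check_overflow_tests` are hypothetical. It also shows the unsigned case, where checking after the fact is legal because wraparound is defined.

```c
#include <assert.h>
#include <limits.h>

/* Signed: compare against the limits first, so no overflowing add is performed. */
int signed_add_overflows(int a, int b) {
    if (b > 0) return a > INT_MAX - b;
    return a < INT_MIN - b;
}

/* Unsigned: wraparound is well defined, so the sum can be inspected afterwards. */
int unsigned_add_wraps(unsigned a, unsigned b) {
    return a + b < a;
}

/* Usage example covering both directions of signed overflow and an unsigned carry. */
void check_overflow_tests(void) {
    assert(signed_add_overflows(INT_MAX, 1));
    assert(signed_add_overflows(INT_MIN, -1));
    assert(!signed_add_overflows(-1, 1));
    assert(unsigned_add_wraps(UINT_MAX, 1u));
    assert(!unsigned_add_wraps(1u, 2u));
}
```

The unsigned check corresponds to the carry flag after an add, while the signed check corresponds to the overflow flag; the two conditions are independent of each other.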

Recommended answer

tl;dr summary:

Wide operations that can be used with garbage in upper bits:
