x86 assembly 16 bit vs 8 bit immediate operand encoding
Question
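I'm writing my own assembler and trying to encode the ADC instruction. I have a question about immediate values, especially when adding an 8-bit value to the AX register.
When adding a 16-bit value, `adc ax, 0xff33` gets encoded as `15 33 ff`, which is correct.
But would it matter if `adc ax, 0x33` gets encoded as `15 33 00`?
NASM encodes this as `83 d0 33`, which is obviously correct, but is my approach correct as well?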
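It's common for x86 to have more than one valid way of encoding an instruction, so `15 33 00` is a valid encoding of `adc ax, 0x33` as well. For example, most `op reg, reg` instructions can be encoded via either the `op r/m, reg` or the `op reg, r/m` opcode.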
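As a minimal sketch of that choice, the Python function below encodes `adc` between two 16-bit registers both ways. The names `REG16`, `modrm`, `encode_adc_reg_reg` and the `use_load_form` flag are made up for this illustration; it assumes 16-bit mode with no prefixes.

```python
# Register numbers used in the ModRM byte for 16-bit general-purpose registers.
REG16 = {"ax": 0, "cx": 1, "dx": 2, "bx": 3, "sp": 4, "bp": 5, "si": 6, "di": 7}


def modrm(mod, reg, rm):
    """Pack the three ModRM fields into one byte."""
    return (mod << 6) | (reg << 3) | rm


def encode_adc_reg_reg(dst, src, use_load_form=False):
    """Encode `adc dst, src` for two 16-bit registers (hypothetical helper)."""
    d, s = REG16[dst], REG16[src]
    if use_load_form:
        # 13 /r: adc r16, r/m16 (destination goes in the reg field).
        return bytes([0x13, modrm(0b11, d, s)])
    # 11 /r: adc r/m16, r16 (destination goes in the r/m field).
    return bytes([0x11, modrm(0b11, s, d)])


print(encode_adc_reg_reg("ax", "bx").hex(" "))        # 11 /r form
print(encode_adc_reg_reg("ax", "bx", True).hex(" "))  # 13 /r form
```

Both byte sequences execute identically; which one an assembler emits by default is a convention, not a correctness issue.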
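And yes, normally you want an assembler to always pick the shortest encoding for an instruction. NASM even optimizes `mov rax, 1` (7 bytes for `mov r64, sign_extended_imm32`) into `mov eax, 1` (5 bytes) for x86-64, changing the operand size to rely on the zero-extension that comes from writing a 32-bit register instead of explicit sign-extension of a 32-bit immediate.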
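Using the sign-extended-imm8 encoding when available is always good
It's equal length for 16-bit, but shorter for 32-bit operand-size, so it simplifies your code to always choose `imm8`.
With an operand size of 32 bits, `op eax, imm32` is 5 bytes, vs. `op r/m32, imm8` still being 3 bytes. (Not counting any prefixes needed to set operand-size or other things; those will be the same for both.)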
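The selection rule could be sketched in Python roughly as follows. `fits_simm8` and `encode_adc_acc_imm` are hypothetical helper names, and the sketch only covers `adc` with AX/EAX as the destination and emits no prefixes.

```python
def fits_simm8(value, bits):
    """True if `value`, viewed as a `bits`-wide integer, is a sign-extended 8-bit value."""
    v = value & ((1 << bits) - 1)
    signed = v - (1 << bits) if v >> (bits - 1) else v
    return -128 <= signed <= 127


def encode_adc_acc_imm(value, bits=16):
    """Encode `adc ax, imm` (bits=16) or `adc eax, imm` (bits=32), prefixes not included."""
    if fits_simm8(value, bits):
        # 83 /2 ib: adc r/m16/32, sign-extended imm8. ModRM D0 = mod 11, /2, rm = (e)ax.
        return bytes([0x83, 0xD0, value & 0xFF])
    # 15 iw/id: short form adc ax/eax, imm16/32 with a full-width little-endian immediate.
    return bytes([0x15]) + (value & ((1 << bits) - 1)).to_bytes(bits // 8, "little")


print(encode_adc_acc_imm(0x33).hex(" "))            # imm8 form
print(encode_adc_acc_imm(0xFF33).hex(" "))          # needs the full imm16
print(len(encode_adc_acc_imm(0x33, bits=32)))       # still 3 bytes
print(len(encode_adc_acc_imm(0x12345678, bits=32))) # 5 bytes
```

Note that a value such as 0xfff3 also takes the imm8 path in 16-bit operand size, because it is the sign extension of 0xf3.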
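Performance advantages of the imm8 encoding
If an operand-size prefix is required (e.g. in 32-bit mode for `adc ax, 0x33`), using the `adc ax/eax/rax, imm16/32/32` encoding with an operand-size prefix will create an LCP stall on Intel CPUs. (Length-Changing Prefix means the prefix changes the length of the rest of the instruction.) This doesn't happen for the imm8 encoding because it's still (prefix) + opcode + modrm + imm8 regardless of the operand-size.
See Agner Fog's microarch.pdf and other performance links in the x86 tag wiki. See also "x86 instruction encoding how to choose opcode", which is a duplicate of this, except for the fact that `adc` is a special case.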
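An assembler could flag the length-changing case with a check along these lines. This is only a sketch: `FULL_WIDTH_IMM_OPCODES` is a deliberately tiny, hypothetical table covering just the opcodes discussed here, and the check assumes the `66` prefix is the first byte of the instruction.

```python
# Opcodes from this discussion whose immediate is imm16 or imm32 depending on
# operand size. Illustrative only; a real assembler would cover the whole opcode map.
FULL_WIDTH_IMM_OPCODES = {0x15, 0x81}


def has_length_changing_prefix(encoding):
    """True if a 66 prefix precedes an opcode whose immediate width follows operand size."""
    return (
        len(encoding) >= 2
        and encoding[0] == 0x66
        and encoding[1] in FULL_WIDTH_IMM_OPCODES
    )


# `adc ax, 0x33` assembled for 32-bit mode, in both encodings.
short_form = bytes.fromhex("66153300")  # 66 15 iw
imm8_form = bytes.fromhex("6683d033")   # 66 83 /2 ib

print(has_length_changing_prefix(short_form))
print(has_length_changing_prefix(imm8_form))
```

With the `83 /2 ib` form the bytes after the opcode are always ModRM plus one immediate byte, so the prefix does not change the instruction length.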
In the specific case of `adc`/`sbb`, there is another advantage to avoiding the `ax, imm16` encoding: see "Which Intel microarchitecture introduced the ADC reg,0 single-uop special case?" On Sandybridge through Haswell, `adc ax, 0` is special-cased as a single-uop instruction, instead of the normal 2 uops for a 3-input operation (ax, flags, immediate).
But this special casing doesn't work for the no-ModRM short form encodings, so the 3-byte `adc ax, imm16` still decodes to 2 uops. Only the decoder for the `imm8` form checks if the immediate is zero before decoding to a single uop. (And it still doesn't work for `adc al, imm8`.)
So always choosing the sign-extended-imm8 encoding whenever possible is optimal for this, too, even in 16-bit mode where no operand-size prefix would be required for `adc ax,0` and thus the LCP-stall issue wouldn't happen.
Most assemblers don't provide an override to avoid the no-ModRM short form. When they were designed, there wasn't a performance use-case other than intentionally lengthening instructions to get alignment without adding NOPs before the top of a loop or other branch target: see "What methods can be used to efficiently extend instruction length on modern x86?"
If you're designing a new flavour of asm syntax, you might consider allowing more control of the encoding with override keywords. For existing designs, check out NASM's `strict` and `nosplit` keywords, and GAS's `{vex2}`, `{vex3}`, `{disp32}` and so on "prefixes":
- `nosplit` to force a longer, more efficient encoding for LEA.
- "How do GNU assembler x86 instruction suffixes like ".s" in "mov.s" work?" (GAS `{disp32}` and so on, and `{load}` or `{store}` to choose which of the `op r/m, r` vs. `op r, r/m` encodings you prefer.)
- "Sign or Zero Extension of address in 64bit mode for MOV moffs32?" In 64-bit mode, `a32 mov eax, [0x123456]` with the no-modrm `moffs` encoding causes an LCP stall on Intel CPUs. It's shorter than modrm+SIB+disp32 for absolute addressing, but potentially slower.
- `mov rax,1` (5 bytes) vs. `mov rax, strict dword 1` (7 bytes) vs. `mov rax, strict qword 1` (10-byte `imm64` encoding)
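
For someone writing their own assembler, an override could be modeled as an extra argument to the encoder. The sketch below is loosely patterned on the three `mov rax` sizes above; the function name `encode_mov_rax_imm` and its `strict` parameter are hypothetical and do not reproduce NASM's actual implementation.

```python
def encode_mov_rax_imm(value, strict=None):
    """Encode `mov rax, value` for 64-bit mode.

    strict=None picks the shortest form; "dword" or "qword" force an immediate size.
    """
    value &= (1 << 64) - 1
    signed = value - (1 << 64) if value >> 63 else value
    fits_simm32 = -(1 << 31) <= signed < (1 << 31)
    if strict == "dword" and not fits_simm32:
        raise ValueError("immediate does not fit a sign-extended imm32")
    if strict is None and value < (1 << 32):
        # B8+rd id: mov eax, imm32; writing EAX zero-extends into RAX.
        return b"\xB8" + value.to_bytes(4, "little")
    if strict != "qword" and fits_simm32:
        # REX.W C7 /0 id: mov r/m64, sign-extended imm32.
        return b"\x48\xC7\xC0" + (value & 0xFFFFFFFF).to_bytes(4, "little")
    # REX.W B8+rd io: mov r64, imm64.
    return b"\x48\xB8" + value.to_bytes(8, "little")


for strict in (None, "dword", "qword"):
    enc = encode_mov_rax_imm(1, strict)
    print(strict, len(enc), enc.hex(" "))
```

Keeping the default path as "shortest encoding" and making longer forms opt-in matches the behaviour described above, while still leaving room for deliberate padding or benchmarking of specific encodings.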