Simple instruction encoding
Question
Let's take the following assembly instruction:
add %cl,%bl
This gets encoded as: 00 cb, or 00000000 11001011 in binary. Putting the cb byte into the ModR/M bitfields, it looks like:
  1   1   0   0   1   0   1   1
+---+---+---+---+---+---+---+---+
|  mod  |    reg    |    r/m    |
+---+---+---+---+---+---+---+---+
And, looking up the register fields here, we get:
- mod: 11 (register addressing mode)
- reg: 001 (cl register)
- r/m: 011 (bl register)
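The field split above can be checked mechanically. The following is a minimal Python sketch, not part of the original question; the helper name split_modrm and the REG8 table are illustrative assumptions, and the register table assumes no REX prefix is present.

```python
# 8-bit register names indexed by their 3-bit encoding (assumes no REX prefix).
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]

def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bitfields."""
    mod = (byte >> 6) & 0b11   # top 2 bits
    reg = (byte >> 3) & 0b111  # middle 3 bits
    rm = byte & 0b111          # low 3 bits
    return mod, reg, rm

mod, reg, rm = split_modrm(0xCB)
# mod == 0b11 means r/m names a register rather than a memory operand.
print(f"mod={mod:02b} reg={reg:03b} ({REG8[reg]}) r/m={rm:03b} ({REG8[rm]})")
# prints: mod=11 reg=001 (cl) r/m=011 (bl)
```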
And, I believe 000000ds is the add instruction, and d=s=0 since they're all registers. Is that a correct understanding of how this instruction is encoded? Additionally, for the 'full encoding' scheme, would the following be accurate (in bytes, not bits):
[empty]   0x0           0b11001011    [empty]   [empty]        [empty]
_______   ___________   ___________   _______   ____________   _________
Prefix    Instruction   Mod-reg-r/m   Scale     displacement   immediate
Are there any things I'm missing here in my attempt at 'decoding' the instruction?
Recommended answer
Yes, looks good.
The general pattern (for "legacy" ALU instructions that date back to 8086) for encoding op r/m, r vs. op r, r/m, and 8-bit vs. 16/32-bit, does use the low 2 bits of the opcode byte in a regular pattern, but there's no need to rely on that.
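As a rough illustration of that regular pattern, here is a small Python sketch for the four add opcodes 00 through 03. The function name describe_add_opcode and the bit names d and w are my own labels (the question calls the low bit s); the output uses Intel operand order.

```python
def describe_add_opcode(opcode):
    """Describe add opcodes 0x00..0x03 from their low two bits (Intel operand order)."""
    if not 0x00 <= opcode <= 0x03:
        raise ValueError("only the add opcodes 00..03 are handled in this sketch")
    d = (opcode >> 1) & 1  # 0: r/m is the destination, 1: reg is the destination
    w = opcode & 1         # 0: 8-bit operands, 1: 16/32-bit operands
    reg = "r16/32" if w else "r8"
    rm = "r/m16/32" if w else "r/m8"
    return f"add {reg}, {rm}" if d else f"add {rm}, {reg}"

for op in range(4):
    print(f"{op:02x}: {describe_add_opcode(op)}")
# prints:
# 00: add r/m8, r8
# 01: add r/m16/32, r16/32
# 02: add r8, r/m8
# 03: add r16/32, r/m16/32
```

Both bits are 0 for opcode 00 because the destination is the r/m operand and the operands are 8 bits wide.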
Intel does fully document exactly what's going on for each encoding of each instruction in their vol. 2 manual. See the Op/En column and Operand Encoding table for add, for example. (See also https://ref.x86asm.net/coder64.htm which also specifies which operand is which for every opcode.) These both let you know which opcodes take a ModRM byte and which don't.
These of course use Intel-syntax order. You're making your life more complicated by trying to follow manuals and tutorials while using AT&T syntax, which reverses the order of the operand list vs. the Intel and AMD manuals.
e.g. 00 /r is listed as MR operand encoding, which from the table we can see is operand 1 = ModRM:r/m (r, w), so it's read and written, and encoded by the r/m field. Operand 2 = ModRM:reg (r), so it's a read-only source encoded by the reg field.
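To make the MR mapping concrete, the following Python sketch builds the two bytes for 00 /r when both operands are 8-bit registers. The function name encode_add_mr and the REG8 table are hypothetical helpers for illustration only, and the sketch assumes register-direct addressing with no prefixes.

```python
# 3-bit encodings of the legacy 8-bit registers (assumes no REX prefix).
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}

def encode_add_mr(dst, src):
    """Encode `add dst, src` (Intel order) as 00 /r with two 8-bit registers."""
    mod = 0b11  # register-direct: the r/m field names a register
    # MR form: operand 1 (destination) goes in r/m, operand 2 (source) in reg.
    modrm = (mod << 6) | (REG8[src] << 3) | REG8[dst]
    return bytes([0x00, modrm])

# AT&T `add %cl,%bl` is Intel `add bl, cl`: destination bl, source cl.
print(encode_add_mr("bl", "cl").hex(" "))
# prints: 00 cb
```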
Fun fact: 00 00 is add [rax], al, or AT&T add %al, (%rax).
Note that you can ask GAS to pick either encoding (see "x86 XOR opcode differences"):
{load} add %cl,%bl # 02 d9
{store} add %cl,%bl # 00 cb
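Both byte sequences describe the same operation, which a small decoder can confirm. This is only a sketch under narrow assumptions: it handles register-direct 00 /r and 02 /r with 8-bit registers and nothing else, and decode_add_r8 is a name made up for this example.

```python
# 8-bit register names indexed by their 3-bit encoding (assumes no REX prefix).
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]

def decode_add_r8(code):
    """Decode a 2-byte register-direct add (00 /r or 02 /r) to Intel syntax."""
    opcode, modrm = code
    if opcode not in (0x00, 0x02) or (modrm >> 6) != 0b11:
        raise ValueError("only register-direct 00 /r and 02 /r are handled")
    reg = REG8[(modrm >> 3) & 0b111]
    rm = REG8[modrm & 0b111]
    # 00 /r (MR): r/m is the destination. 02 /r (RM): reg is the destination.
    dst, src = (rm, reg) if opcode == 0x00 else (reg, rm)
    return f"add {dst}, {src}"

for code in (bytes.fromhex("00cb"), bytes.fromhex("02d9")):
    print(code.hex(" "), "->", decode_add_r8(code))
# prints:
# 00 cb -> add bl, cl
# 02 d9 -> add bl, cl
```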
See also: Difference between MOV r/m8,r8 and MOV r8,r/m8