Simple instruction encode

Question

Let's take the following assembly instruction:

add    %cl,%bl

This gets encoded as: 00 cb, or 00000000 11001011 in binary. Putting the cb into the ModR/M bitfields, it looks like:

  1   1   0   0   1   0  1   1
+---+---+---+---+---+---+---+---+
|  mod  |    reg    |    r/m    |
+---+---+---+---+---+---+---+---+

And, looking up the register fields, we get:

  • mod: 11 (Register addressing mode)
  • reg: 001 (cl register)
  • r/m: 011 (bl register)
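The field split above can be checked mechanically. The following is a minimal Python sketch, not part of the original post; the helper name `split_modrm` and the `REG8` table are illustrative assumptions, and the table only covers the 8-bit register names used when no REX prefix is present.

```python
# 8-bit register names indexed by their 3-bit encoding
# (assumption: no REX prefix, so 4..7 mean ah/ch/dh/bh).
REG8 = ["al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"]

def split_modrm(byte):
    """Split a ModR/M byte into its (mod, reg, rm) bitfields."""
    mod = (byte >> 6) & 0b11   # top 2 bits: addressing mode
    reg = (byte >> 3) & 0b111  # middle 3 bits: register operand
    rm = byte & 0b111          # low 3 bits: register or memory operand
    return mod, reg, rm

mod, reg, rm = split_modrm(0xCB)
# The register names below are only meaningful because mod == 0b11
# (register-direct) and the operand size is 8 bits.
print(f"mod={mod:02b} reg={reg:03b} ({REG8[reg]}) r/m={rm:03b} ({REG8[rm]})")
# prints: mod=11 reg=001 (cl) r/m=011 (bl)
```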

And, I believe 000000ds is the add instruction, and d=s=0 since they're all registers. Is that a correct understanding of how this instruction is encoded? Additionally, for the 'full encoding' scheme, would the following be accurate (in bytes, not bits):

[empty]         0x0         0b11001011     [empty]        [empty]          [empty]
_ _ _ _        _ _             _              _           _ _ _ _          _ _ _ _
Prefix      Instruction    Mod-reg-r/m      Scale       displacement      immediate

Are there any things I'm missing here in my attempt at 'decoding' the instruction?

Recommended answer

Yes, that looks right.

The general pattern (for "legacy" ALU instructions that date back to 8086) for encoding op r/m, r vs. op r, r/m, and 8-bit vs. 16/32 bit does use the low 2 bits of the opcode byte in a regular pattern, but there's no need to rely on that.
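As a rough illustration of that regular pattern, here is a Python sketch that reads the two low opcode bits of the legacy two-operand ALU opcodes. The function name `describe_alu_opcode` is hypothetical, and the sketch deliberately covers only opcodes below 0x40 with bit 2 clear (the `op r/m, r` and `op r, r/m` forms); anything else should be looked up in the manual rather than inferred from bits.

```python
# Bits 5:3 of these opcodes select the operation.
ALU_OPS = ["add", "or", "adc", "sbb", "and", "sub", "xor", "cmp"]

def describe_alu_opcode(opcode):
    """Describe a legacy ALU opcode (Intel operand order) from its bits."""
    if not 0 <= opcode < 0x40 or opcode & 0b100:
        raise ValueError("not an 'op r/m, r' or 'op r, r/m' ALU opcode")
    mnemonic = ALU_OPS[opcode >> 3]
    wide = opcode & 0b01        # bit 0: 0 = 8-bit, 1 = 16/32-bit operands
    to_reg = opcode & 0b10      # bit 1: 0 = r/m is the destination, 1 = reg is
    size = "16/32" if wide else "8"
    if to_reg:
        return f"{mnemonic} r{size}, r/m{size}"
    return f"{mnemonic} r/m{size}, r{size}"

for op in (0x00, 0x01, 0x02, 0x03):
    print(f"{op:02x}: {describe_alu_opcode(op)}")
```

For opcode 00 both bits are clear, which matches the `add r/m8, r8` form used in the question.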

Intel does fully document exactly what's going on for each encoding of each instruction in their vol.2 manual. See the Op/En column and Operand Encoding table for add for example. (See also https://ref.x86asm.net/coder64.htm which also specifies which operand is which for every opcode). These both let you know which opcodes take a ModRM byte and which don't.

These of course use Intel-syntax order. You're making your life more complicated by trying to follow manuals and tutorials while using AT&T syntax which reverses the order of the operand-list vs. Intel and AMD manuals.

e.g. 00 /r is listed as MR operand encoding, which from the table we can see is operand 1 = ModRM:r/m (r, w), so it's read and written, and encoded by the r/m field. operand 2 = ModRM:reg (r), so it's a read-only source encoded by the reg field.

Fun fact: 00 00 is add [rax], al, or AT&T add %al, (%rax)

Note that you can ask GAS to pick either encoding (see "x86 XOR opcode differences"):

{load}  add    %cl,%bl        # 02 d9
{store} add    %cl,%bl        # 00 cb
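To see why both byte sequences mean the same thing, here is a small Python sketch that builds the two encodings by hand. The function `encode_add_r8` and its `form` argument are made up for this illustration (they only mirror the `{load}` and `{store}` names above), and the sketch handles register-to-register 8-bit `add` only.

```python
# 3-bit encodings of the 8-bit registers (assumption: no REX prefix).
REG8 = {"al": 0, "cl": 1, "dl": 2, "bl": 3, "ah": 4, "ch": 5, "dh": 6, "bh": 7}

def encode_add_r8(dst, src, form):
    """Encode 'add dst, src' (Intel order) for two 8-bit registers.

    form="store": opcode 00 /r (MR), dst goes in r/m, src goes in reg.
    form="load":  opcode 02 /r (RM), dst goes in reg, src goes in r/m.
    """
    if form == "store":
        opcode, reg, rm = 0x00, REG8[src], REG8[dst]
    elif form == "load":
        opcode, reg, rm = 0x02, REG8[dst], REG8[src]
    else:
        raise ValueError("form must be 'store' or 'load'")
    modrm = (0b11 << 6) | (reg << 3) | rm  # mod=11: register-direct
    return bytes([opcode, modrm])

# AT&T 'add %cl,%bl' is Intel 'add bl, cl': bl is the destination.
print(encode_add_r8("bl", "cl", "store").hex(" "))  # 00 cb
print(encode_add_r8("bl", "cl", "load").hex(" "))   # 02 d9
```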

See also: Difference between MOV r/m8,r8 and MOV r8,r/m8