rbp not allowed as SIB base?


Question

I'm quite new to x86-64 binary encoding. I'm trying to fix some old "assembler" code.

Anyways, I'm trying to do something like this (Intel syntax):

mov    [rbp+rcx], al

The assembler is currently generating this:

88 04 0D

but that doesn't seem to be a valid instruction. If I change out the base in the SIB-byte from rbp to some other register, it works fine. Another way to make it work is to add a one byte displacement of zero (88 44 0D 00). This seems to happen with other similar opcodes.

Why can't I use rbp there with mod=00?

Answer

See also https://wiki.osdev.org/X86-64_Instruction_Encoding#32.2F64-bit_addressing_2 or Intel's vol.2 manual for tables of encodings and footnotes for these special cases. This answer points out the special cases, and talks about why those design choices make some sense, i.e. what design problem they needed to solve.


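The encoding that would otherwise mean base=rbp with mod=00 is instead an escape code for "no base register": a bare disp32 when it appears in the SIB byte, or a RIP-relative rel32 when it appears in ModRM. Your 88 04 0D therefore decodes as [rcx*1 + disp32], and the decoder expects four more displacement bytes. Most assemblers assemble [rbp] into [rbp + disp8=0].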

Since you don't need it scaled, use [rcx + rbp] instead to avoid needing a disp8=0, because rbp can be an index.

(SS and DS are always equivalent in long mode, so it doesn't matter that base=RBP implies SS while base=RCX implies using the DS segment.)
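To make the byte-level difference concrete, here is a minimal Python sketch of how an assembler might pick between these encodings for `mov [base + index*1], al`. The helper name `encode_mov_store_al` and the `REG` table are made up for illustration; the sketch only covers the eight legacy registers, scale 1, and no user-supplied displacement.

```python
# Register numbers as they appear in ModRM/SIB fields (low 3 bits; no REX here).
REG = {"rax": 0, "rcx": 1, "rdx": 2, "rbx": 3,
       "rsp": 4, "rbp": 5, "rsi": 6, "rdi": 7}


def encode_mov_store_al(base, index):
    """Encode `mov [base + index*1], al` (opcode 88 /r) with a SIB byte.

    Hypothetical helper for illustration only.
    """
    b, i = REG[base], REG[index]
    if i == 4:
        # index=100 means "no index", so rsp can never be the index.
        raise ValueError("rsp cannot be an index register")

    sib = (0 << 6) | (i << 3) | b          # scale=1, index, base
    if b == 5:
        # mod=00 with SIB.base=101 would mean "disp32, no base",
        # so use mod=01 and an explicit zero disp8 instead.
        modrm = (0b01 << 6) | (0 << 3) | 0b100   # reg=al, rm=100 -> SIB follows
        return bytes([0x88, modrm, sib, 0x00])

    modrm = (0b00 << 6) | (0 << 3) | 0b100
    return bytes([0x88, modrm, sib])


print(encode_mov_store_al("rbp", "rcx").hex())  # 88440d00
print(encode_mov_store_al("rcx", "rbp").hex())  # 880429
```

Swapping the operands so that rbp is the index saves the displacement byte, which is the same trick described above for [rcx + rbp].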


x86 / x86-64 ModRM addressing mode encoding special cases

(This section is from an answer I wrote on "Why are rbp and rsp called general purpose registers?". This question looks like the perfect place to copy or transplant it.)

rbp/r13 can't be a base register with no displacement: that encoding instead means: (in ModRM) rel32 (RIP-relative), or (in SIB) disp32 with no base register. (r13 uses the same 3 bits in ModRM/SIB, so this choice simplifies decoding by not making the instruction-length decoder look at the REX.B bit to get the 4th base-register bit). [r13] assembles to [r13 + disp8=0]. [r13+rdx] assembles to [rdx+r13] (avoiding the problem by swapping base/index when that's an option).

rsp/r12 as a base register always needs a SIB byte. (The ModR/M encoding of base=RSP is the escape code that signals a SIB byte, and again, more of the decoder would have to care about the REX prefix if r12 were handled differently.)

rsp can't be an index register. This makes it possible to encode [rsp], which is more useful than [rsp + rsp]. (Intel could have designed the ModRM/SIB encodings for 32-bit addressing modes (new in 386) so SIB-with-no-index was only possible with base=ESP. That would make [eax + esp*4] possible and only exclude [esp + esp*1/2/4/8]. But that's not useful, so they simplified the hardware by making index=ESP the code for no index regardless of the base. This allows two redundant ways to encode any base or base+disp addressing mode: with or without a SIB.)

r12 can be an index register. Unlike the other cases, this doesn't affect instruction-length decoding. Also, it couldn't have been worked around with a longer encoding like the other cases can. AMD wanted AMD64's register set to be as orthogonal as possible, so it makes sense they'd spend a few extra transistors to check REX.X as part of the index / no-index decoding. For example, [rsp + r12*4] requires index=r12, so having r12 not be fully general-purpose would make AMD64 a worse compiler target.

   0:   41 8b 03                mov    eax,DWORD PTR [r11]
   3:   41 8b 04 24             mov    eax,DWORD PTR [r12]      # needs a SIB like RSP
   7:   41 8b 45 00             mov    eax,DWORD PTR [r13+0x0]  # needs a disp8 like RBP
   b:   41 8b 06                mov    eax,DWORD PTR [r14]
   e:   41 8b 07                mov    eax,DWORD PTR [r15]
  11:   43 8b 04 e3             mov    eax,DWORD PTR [r11+r12*8] # *can* be an index
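The special cases above can be sketched as a small decoder. This is an illustrative Python sketch, not a real disassembler: the name `decode_mem_operand` is made up, and it assumes 64-bit mode, at most a REX prefix, a one-byte opcode, and a memory (not register) r/m operand.

```python
NAMES = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]


def decode_mem_operand(code):
    """Return (base, index, scale, disp) for the memory operand in `code`.

    Hypothetical helper: `code` is [REX] opcode ModRM [SIB] [disp].
    """
    pos = 0
    rex = 0
    if code[pos] & 0xF0 == 0x40:           # 0x40..0x4F is a REX prefix
        rex = code[pos]
        pos += 1
    pos += 1                               # skip the one-byte opcode
    modrm = code[pos]
    pos += 1
    mod, rm = modrm >> 6, modrm & 7
    if mod == 3:
        raise ValueError("mod=11 selects a register, not memory")
    rex_x, rex_b = (rex >> 1) & 1, rex & 1

    base, index, scale = None, None, 1
    disp_size = {0: 0, 1: 1, 2: 4}[mod]

    if rm == 4:                            # rsp/r12 slot: a SIB byte follows
        sib = code[pos]
        pos += 1
        scale = 1 << (sib >> 6)
        idx = ((sib >> 3) & 7) | (rex_x << 3)
        if idx != 4:                       # only 0b0100 means "no index"; r12 is fine
            index = NAMES[idx]
        if mod == 0 and (sib & 7) == 5:    # rbp/r13 slot: disp32, no base
            disp_size = 4
        else:
            base = NAMES[(sib & 7) | (rex_b << 3)]
    elif mod == 0 and rm == 5:             # rbp/r13 slot in ModRM: RIP-relative
        base = "rip"
        disp_size = 4
    else:
        base = NAMES[rm | (rex_b << 3)]

    if len(code) < pos + disp_size:
        raise ValueError("truncated: displacement bytes missing")
    disp = int.from_bytes(code[pos:pos + disp_size], "little", signed=True)
    return base, index, scale, disp
```

Under these assumptions, `decode_mem_operand(bytes.fromhex("88440d00"))` returns `("rbp", "rcx", 1, 0)`, while the three-byte `88 04 0d` from the question raises `ValueError` because the no-base form still needs its disp32.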

These all apply to 32-bit addressing modes as well; the encoding is identical except there's no EIP-relative encoding, just two redundant ways to encode disp32 with no base.


From the question: "This seems to happen with other similar opcodes."

ModRM encoding of r/m operands is always the same. Some opcodes require a register operand, and some require memory, but the actual ModRM + optional SIB + optional displacement is fixed so the same hardware can decode it regardless of the instruction.
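As a rough illustration of why this is opcode-independent, the number of bytes that follow the ModRM byte can be computed from ModRM and the low three bits of the SIB base alone. The function name `modrm_tail_length` is hypothetical, and the sketch ignores immediates and 16-bit addressing.

```python
def modrm_tail_length(modrm, sib=0):
    """Count the SIB and displacement bytes that follow a ModRM byte.

    Hypothetical helper: no REX or opcode argument is needed, because
    the answer never depends on them. `sib` is only read when rm=100.
    """
    mod, rm = modrm >> 6, modrm & 7
    if mod == 3:                  # register operand: nothing follows
        return 0

    has_sib = rm == 4             # rsp/r12 slot always means "SIB follows"
    if mod == 1:
        disp = 1                  # disp8
    elif mod == 2:
        disp = 4                  # disp32
    elif rm == 5 or (has_sib and (sib & 7) == 5):
        disp = 4                  # rbp/r13 slot with mod=00: disp32 escape
    else:
        disp = 0
    return int(has_sib) + disp


print(modrm_tail_length(0x04, 0x0D))  # 5
print(modrm_tail_length(0x44, 0x0D))  # 2
```

The first call is the ModRM/SIB pair from the question (one SIB byte plus a disp32); the second is the working variant with a disp8.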

There are a few rare opcodes like mov al/ax/eax/rax, [qword absolute_address] that don't use ModRM encoding at all for their operands, but every opcode that does use ModRM uses the same format.
