Sign or Zero Extension of address in 64bit mode for MOV moffs32?


Problem description

Consider the instruction MOV EAX,[0xFFFFFFFF], encoded in 64-bit mode as 67 A1 FF FF FF FF (the 67 prefix toggles the effective address size from the default 64 bits to 32 bits).

Intel's instruction reference manual (doc Order Number: 325383-057US from December 2015) on page Vol. 2A 2-11 says:

2.2.1.3 Displacement
Addressing in 64-bit mode uses existing 32-bit ModR/M and SIB encodings. The ModR/M and SIB sizes do not change. They remain 8 bits or 32 bits and are sign-extended to 64 bits.

This suggests that a 32-bit displacement should be sign-extended, but I am not sure whether this also applies to the special moffs addressing mode. On the next page Intel says:

2.2.1.6 RIP-Relative Addressing

RIP-relative addressing is enabled by 64-bit mode, not by a 64-bit address-size. The use of the address-size prefix does not disable RIP-relative addressing. The effect of the address-size prefix is to truncate and zero-extend the computed effective address to 32 bits.

This suggests that in RIP-relative addressing mode the disp32 is sign-extended to 64 bits, added to RIP, and the result is then truncated and zero-extended. However, I am not sure whether the same rule applies to the absolute addressing mode used by the MOV moffs forms.
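
The reading of section 2.2.1.6 given above can be written down as a small model. This is only a sketch of that interpretation, not a statement of what any particular CPU does; the function names and the sample `next_rip` value are made up for illustration, and it assumes the usual rule that RIP-relative displacements are taken relative to the address of the next instruction.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def sign_extend32(value):
    """Interpret the low 32 bits of value as a signed two's-complement integer."""
    value &= MASK32
    return value - (1 << 32) if value & 0x80000000 else value


def rip_relative_address(next_rip, disp32, addr32_prefix):
    """Model of [rip + disp32] in 64-bit mode.

    next_rip is the address of the instruction that follows (hypothetical input).
    With a 67 prefix the computed address is truncated to 32 bits; the upper
    32 bits of the result are then zero, which is the zero-extension.
    """
    ea = (next_rip + sign_extend32(disp32)) & MASK64
    if addr32_prefix:
        ea &= MASK32
    return ea


next_rip = 0x00007FFF00001000  # hypothetical address, chosen only for the demo
print(hex(rip_relative_address(next_rip, 0xFFFFFFF0, False)))  # 0x7fff00000ff0
print(hex(rip_relative_address(next_rip, 0xFFFFFFF0, True)))   # 0xff0
```

The displacement 0xFFFFFFF0 acts as -16 in both cases; only the final truncation differs.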

Which address will EAX be loaded from: A) FFFFFFFFFFFFFFFF or B) 00000000FFFFFFFF?

Solution

67 A1 FFFFFFFF isn't using a disp32 addressing mode, so the ModR/M section of the documentation doesn't apply.

Intel's x86 manual vol.1 says:

All 16-bit and 32-bit address calculations are zero-extended in IA-32e mode to form 64-bit addresses. Address calculations are first truncated to the effective address size of the current mode (64-bit mode or compatibility mode), as overridden by any address-size prefix. The result is then zero-extended to the full 64-bit address width. [...] A 32-bit address generated in 64-bit mode can access only the low 4 GBytes of the 64-bit mode effective addresses.

This applies to the special moffs absolute addressing forms of mov as well as to regular ModR/M addressing modes like mov eax, [edi] instead of mov eax, [rdi]. So the load in the question is from address B) 00000000FFFFFFFF.
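
The truncate-then-zero-extend rule from the quoted passage can be sketched in a few lines of Python. This is only a model of the quoted rule, with made-up function names; `sign_extended_from_32` exists purely to show what option A from the question would have produced.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


def effective_address(raw_address, address_size_bits):
    """Truncate to the effective address size; the result is zero-extended to 64 bits."""
    if address_size_bits not in (16, 32, 64):
        raise ValueError("address size must be 16, 32 or 64 bits")
    return raw_address & ((1 << address_size_bits) - 1)


def sign_extended_from_32(value):
    """Option A from the question: sign-extend 32 bits to 64 (shown for contrast only)."""
    value &= MASK32
    return value | (MASK64 ^ MASK32) if value & 0x80000000 else value


# The four address bytes that follow 67 A1 in the question's encoding.
moffs = int.from_bytes(bytes.fromhex("FFFFFFFF"), "little")
print(f"{effective_address(moffs, 32):016X}")   # 00000000FFFFFFFF  (option B)
print(f"{sign_extended_from_32(moffs):016X}")   # FFFFFFFFFFFFFFFF  (option A)

# The same rule for mov eax, [edi]: only the low 32 bits of RDI form the address.
rdi = 0x123456789ABCDEF0  # hypothetical register value
print(f"{effective_address(rdi, 32):016X}")     # 000000009ABCDEF0
```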

Note that the moffs8/16/32/64 naming shows the operand-size, not the address size (e.g. mov al, moffs8). There isn't a different term for a 32-bit address size moffs in 64-bit mode.

The address-size prefix changes the address that follows the A1 opcode from a 64-bit immediate address to a 32-bit one, i.e. it changes the length of the rest of the instruction (unlike ModR/M addressing modes in 64-bit mode, which are always disp0/8/32). According to my testing, this actually causes LCP stalls on Skylake for a32 mov eax, [abs buf]. (NASM chooses the moffs encoding for that case because, with the a32 override specified, it's shorter than ModR/M + disp32.)

See also "Does a Length-Changing Prefix (LCP) incur a stall on a simple x86_64 instruction?" for much more detail about LCP stalls in general, including with 67h address-size prefixes.
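
To make the length change concrete, here is a rough byte count for the two encodings discussed above. It is a simplified sketch with hypothetical helper names: it counts only the 67 prefix, the opcode, and the address bytes, and for the ModR/M form it assumes the absolute [disp32] encoding that needs a SIB byte in 64-bit mode.

```python
def moffs_load_length(addr32_prefix):
    """Bytes in a 64-bit-mode `mov eax, moffs32` (opcode A1), ignoring other prefixes."""
    prefix_bytes = 1 if addr32_prefix else 0
    address_bytes = 4 if addr32_prefix else 8  # the 67 prefix changes this field
    return prefix_bytes + 1 + address_bytes


def modrm_abs_load_length(addr32_prefix):
    """Bytes in a 64-bit-mode `mov eax, [disp32]` via ModR/M + SIB + disp32."""
    prefix_bytes = 1 if addr32_prefix else 0
    return prefix_bytes + 1 + 1 + 1 + 4  # opcode, ModR/M, SIB, disp32 (size unchanged)


print(moffs_load_length(False))      # 9: A1 + 8 address bytes
print(moffs_load_length(True))       # 6: 67 A1 + 4 address bytes
print(modrm_abs_load_length(True))   # 8: longer than the a32 moffs form
```

In the moffs case the number of bytes after the opcode depends on the prefix, while in the ModR/M case the prefix only adds one byte in front.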


Anyway, this means that disassembling it as mov eax, [0xFFFFFFFF] is wrong (at least in NASM syntax), because that assembles back into an instruction that does something different.

The correct YASM/NASM syntax that will assemble back to that machine code is

a32 mov eax, [0xFFFFFFFF]

NASM also accepts mov eax, [a32 0xFFFFFFFF], but YASM doesn't.


GNU as also provides a way to express it (without using .byte):
addr32 mov 0xffffffff,%eax

movl    0x7FFFFFFF, %eax  # 8B mod/rm disp32
movl    0xFFFFFFFF, %eax  # A1 64bit-moffs32: Older GAS versions may have required the movabs mnemonic to force a moffs encoding

movabs  0x7FFFFF, %eax    #     A1 64b-moffs32: movabs forces MOFFS
movabs  0xFFFFFFFF, %rax  # REX A1 64b-moffs64
movabs  0xFFFF, %ax       #  66 A1 64b-moffs16: operand-size prefix

.byte 0x67, 0xa1, 0xff, 0xff, 0xff, 0xff  # disassembles to  addr32 mov 0xffffffff,%eax
                                          # and that syntax works as assembler input:
addr32 mov 0xffffffff,%eax    # 67 A1 FF FF FF FF:  32b-moffs32


With NASM/YASM, there's no way to force a 32-bit moffs encoding in a way that refuses to assemble with a register other than AL/AX/EAX/RAX. For example, a32 mov [0xfffffff], cl assembles to 67 88 0c 25 ff ff ff 0f, which disassembles as addr32 mov BYTE PTR ds:0xfffffff,cl (a ModR/M + disp32 encoding of mov r/m8, r8).

You can write mov eax, [qword 0xffff...] to get the moffs64 encoding, but there's no way to require a 32-bit moffs encoding.
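
The bytes 67 88 0c 25 ff ff ff 0f quoted above can be picked apart by hand to confirm that they are a ModR/M form and not a moffs form. The sketch below only splits the bit fields of this one fixed byte string; it is not a general decoder, and the helper name is made up.

```python
def split_fields(byte):
    """Split a ModR/M or SIB byte into its 2-bit, 3-bit and 3-bit fields."""
    return byte >> 6, (byte >> 3) & 7, byte & 7


code = bytes.fromhex("67 88 0c 25 ff ff ff 0f")

prefix, opcode = code[0], code[1]          # 67 = address-size prefix, 88 = mov r/m8, r8
mod, reg, rm = split_fields(code[2])       # ModR/M byte 0x0c
scale, index, base = split_fields(code[3]) # SIB byte 0x25
disp32 = int.from_bytes(code[4:8], "little")

print(mod, reg, rm)        # 0 1 4 -> reg=1 is CL, rm=4 means a SIB byte follows
print(scale, index, base)  # 0 4 5 -> no index; base=5 with mod=0 means disp32 only
print(hex(disp32))         # 0xfffffff
```

With mod=0, index=4 and base=5 the address is just the disp32, so this is the absolute [disp32] form reached through ModR/M + SIB.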


Agner Fog's objconv disassembler gets it wrong (when disassembling the machine code that GNU as produced from the block above). objconv appears to assume sign-extension. (It puts the machine code in comments as prefixes: opcode, operands.)

; Note: Absolute memory address without relocation
    mov     eax, dword [abs qword 7FFFFFH]          ; 0033 _ A1, 00000000007FFFFF
 ...
; Note: Absolute memory address without relocation
    mov     eax, dword [0FFFFFFFFFFFFFFFFH]         ; 0056 _ 67: A1, FFFFFFFF

ndisasm -b64 also disassembles incorrectly, to code that doesn't even work the same way:

00000073  A1FFFF7F00000000  mov eax,[qword 0x7fffff]
         -00
...
00000090  67A1FFFFFFFF      mov eax,[0xffffffff]

I would have expected a disassembly like mov eax, [qword 0xffffffff], if it's not going to use the a32 keyword. That would assemble to a 64-bit moffs that references the same address as the original, but is longer. Probably this was overlooked when adding AMD64 support to ndisasm, which already existed before AMD64.
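
As an illustration of the output argued for here, the following sketch renders just the two encodings from the listings above into NASM source that should assemble back to the same bytes. It is a toy covering only plain A1 and 67 A1 in 64-bit mode, not a patch for ndisasm or objconv, and the function name is hypothetical.

```python
def render_mov_eax_moffs(code):
    """Render a 64-bit-mode A1 / 67 A1 load into EAX as round-trippable NASM text."""
    if len(code) == 6 and code[:2] == b"\x67\xa1":
        # 32-bit address size: the address is zero-extended, so print it with a32.
        address = int.from_bytes(code[2:6], "little")
        return f"a32 mov eax, [0x{address:x}]"
    if len(code) == 9 and code[:1] == b"\xa1":
        # Default 64-bit address size: ask NASM for the moffs64 form explicitly.
        address = int.from_bytes(code[1:9], "little")
        return f"mov eax, [qword 0x{address:x}]"
    raise ValueError("not a plain A1 or 67 A1 encoding")


print(render_mov_eax_moffs(bytes.fromhex("67A1FFFFFFFF")))
# a32 mov eax, [0xffffffff]
print(render_mov_eax_moffs(bytes.fromhex("A1FFFF7F0000000000")))
# mov eax, [qword 0x7fffff]
```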
