Why isn't movl from memory to memory allowed?

Question

I was wondering if this is allowed in assembly,

 movl (%edx) (%eax) 

I would have guessed that it accesses the memory at the first operand and stores the value into the memory at the second operand, something like *a = *b, but I haven't seen any example that does this, so I'm guessing it's not allowed. Also, I've been told that this isn't allowed:

 leal %esi (%edi)

Why is that? Lastly, are there other similar functions I should be aware of that aren't allowed?

Answer

movl (mem), (mem)

mov dword [eax], [ecx]    ; or the equivalent in Intel-syntax

is invalid because x86 machine code doesn't have an encoding for mov with two addresses. (In fact no x86 instruction can ever have two arbitrary addressing modes.)

It has mov r32, r/m32 and mov r/m32, r32. Reg-reg moves can be encoded using either the mov r32, r/m32 opcode or the mov r/m32, r32 opcode. Many other instructions have two opcodes, one where the dest has to be a register, and one where the src has to be a register.
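
To make the two-opcode point concrete, here is a small C sketch that builds both encodings of the register-to-register move mov eax, ecx. The opcode bytes 0x89 (mov r/m32, r32) and 0x8B (mov r32, r/m32) and the register numbers are the documented ones; the helper names (modrm, encode_mov_store_form, encode_mov_load_form) are made up for this illustration and are not part of any assembler.

```c
#include <stdint.h>

/* Register numbers as they appear in the 3-bit ModRM fields. */
enum reg32 { EAX = 0, ECX = 1, EDX = 2, EBX = 3 };

/* Pack a ModRM byte: mod (2 bits), reg (3 bits), r/m (3 bits). */
static uint8_t modrm(unsigned mod, unsigned reg, unsigned rm)
{
    return (uint8_t)(((mod & 3u) << 6) | ((reg & 7u) << 3) | (rm & 7u));
}

/* mov dst, src via opcode 0x89 (mov r/m32, r32): src sits in the reg field. */
static void encode_mov_store_form(enum reg32 dst, enum reg32 src, uint8_t out[2])
{
    out[0] = 0x89;
    out[1] = modrm(3, src, dst);   /* mod = 11b: the r/m field names a register */
}

/* The same mov dst, src via opcode 0x8B (mov r32, r/m32): dst sits in the reg field. */
static void encode_mov_load_form(enum reg32 dst, enum reg32 src, uint8_t out[2])
{
    out[0] = 0x8B;
    out[1] = modrm(3, dst, src);
}
```

Calling both helpers with (EAX, ECX) gives the byte pairs 89 C8 and 8B C1, two different machine-code encodings of the same reg-reg move. In either opcode, only the r/m field could be switched to a memory operand; the reg field never can.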

(And there are some specialized forms, like mov r32, imm32, or the accumulator-only movabs rax, [64bit-absolute-address].)

See the x86 instruction set reference manual (links in the x86 tag wiki https://stackoverflow.com/tags/x86/info). I used Intel/NASM syntax here because that's what the insn ref manual does.

Very few instructions can do a load and store to two different addresses, e.g. movs (string-move), and push/pop (mem) (What x86 instructions take two (or more) memory operands?). In all of those cases, at least one of the memory addresses is implicit (implied by the opcode), not an arbitrary choice that could be [eax] or [edi + esi*4 + 123] or whatever.

Many ALU instructions are available with a memory destination. This is a read-modify-write on a single memory location, using the same addressing mode for load and then store. This shows that the limit wasn't that 8086 couldn't load and store, it was a decoding complexity (and machine-code compactness / format) limitation.


There are no instructions that take two arbitrary effective-addresses (i.e. specified with a flexible addressing mode). movs has implicit source and dest operands, and push has an implicit dest (the stack memory that esp points to).
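
As a rough illustration of what "implicit operands" means, here is a C model of a single 32-bit movs (movsd in Intel syntax, movsl in AT&T). It is a sketch under assumptions: the struct and function names are invented, segment registers are ignored, and the direction flag is assumed to be clear so both pointers advance.

```c
#include <stdint.h>

/* The two registers that movs always uses; the instruction encodes neither. */
struct string_regs {
    const uint32_t *esi;  /* implicit source pointer */
    uint32_t       *edi;  /* implicit destination pointer */
};

/* Model of one 32-bit movs step with DF = 0 (assumption): copy, then advance both. */
static void movsd_model(struct string_regs *r)
{
    *r->edi = *r->esi;  /* memory-to-memory copy through the fixed registers */
    r->esi += 1;        /* esi += 4 bytes */
    r->edi += 1;        /* edi += 4 bytes */
}
```

The function takes no operand choices at all, which mirrors the real instruction: the opcode alone implies both addresses, so no ModRM byte is needed to describe them.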

An x86 instruction has at most one ModRM byte, and a ModRM can only encode one reg/memory operand (2 bits for mode, 3 bits for base register), and another register-only operand (3 bits). With an escape code, ModRM can signal a SIB byte to encode base + scaled-index for the memory operand, but there's still only room to encode one memory operand.
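
The field layout described above can be sketched as a tiny decoder in C. This is illustrative only: the struct and function names are invented, and the SIB rule shown assumes 32-bit addressing (16-bit addressing modes have no SIB byte).

```c
#include <stdbool.h>
#include <stdint.h>

struct modrm_fields {
    unsigned mod;       /* 2 bits: addressing-mode selector */
    unsigned reg;       /* 3 bits: the register-only operand (or an opcode extension) */
    unsigned rm;        /* 3 bits: a register, or the base of the single memory operand */
    bool rm_is_memory;  /* true unless mod == 3 */
    bool has_sib;       /* 32-bit addressing: r/m == 4 with mod != 3 escapes to a SIB byte */
};

/* Split one ModRM byte into its three fields. */
static struct modrm_fields decode_modrm(uint8_t byte)
{
    struct modrm_fields f;
    f.mod = (byte >> 6) & 3u;
    f.reg = (byte >> 3) & 7u;
    f.rm  = byte & 7u;
    f.rm_is_memory = (f.mod != 3u);
    f.has_sib = f.rm_is_memory && (f.rm == 4u);
    return f;
}
```

For example, the ModRM byte 0x08 from mov [eax], ecx (bytes 89 08) decodes to mod = 0, reg = 1 (ecx), r/m = 0 (eax as a memory base). All eight bits are used up, and only the r/m field can ever refer to memory.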

As I mentioned above, the memory-source and memory-destination forms of the same instruction (asm source mnemonic) use two different opcodes. As far as the hardware is concerned, they are different instructions.


The reasons for this design choice are probably partly implementation complexity: If it's possible for a single instruction to need two results from an AGU (address-generation-unit), then the wiring has to be there to make that possible. Some of this complexity is in the decoders that figure out which instruction an opcode is, and parse the remaining bits / bytes to figure out what the operands are. Since no other instruction can have multiple r/m operands, it would cost extra transistors (silicon area) to support a way to encode two arbitrary addressing modes. The same goes for the logic that has to figure out how long an instruction is, so it knows where to start decoding the next one.

It also potentially gives an instruction five input dependencies (a two-register addressing mode for the store address, the same for the load address, and the load data). When 8086 / 80386 were being designed, superscalar / out-of-order / dependency tracking probably wasn't on the radar. 386 added a lot of new instructions, so a mem-to-mem encoding of mov could have been added, but wasn't. If 386 had started to forward results directly from ALU output to ALU input and stuff like that (to reduce latency compared to always committing results to the register file), then the number of input dependencies would have been one of the reasons it wasn't implemented.

If it existed, Intel P6 would probably decode it to two separate uops, a load and a store. It certainly wouldn't make sense to introduce now, or any time after 1995 when P6 was designed and simpler instructions gained more of a speed advantage over complex ones. (See http://agner.org/optimize/ for stuff about making code run fast.)

I can't see this being very useful, anyway, at least not compared to the cost in code-density. If you want this, you're probably not making enough use of registers. Figure out how to process your data on the fly while copying, if possible. Of course, sometimes you just have to do a load and then a store, e.g. in a sort routine to swap the rest of a struct after comparing based on one member. Doing moves in larger blocks (e.g. using xmm registers) is a good idea.
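
For the cases where you really do just need a load followed by a store, the *a = *b from the question looks like this in C. The assembly in the comments is only a sketch of what a compiler might emit for 32-bit x86; the register choices are hypothetical, and the function name copy_dword is invented for the example.

```c
#include <stdint.h>

/* Copy one 32-bit value from *src to *dst: the "*a = *b" from the question. */
static void copy_dword(uint32_t *dst, const uint32_t *src)
{
    uint32_t tmp = *src;  /* load into a scratch register, e.g. movl (%edx), %ecx */
    *dst = tmp;           /* store from that register,     e.g. movl %ecx, (%eax) */
}
```

The temporary plays the role of the scratch register: each of the two instructions has exactly one memory operand, which is what the encoding allows.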


leal %esi, (%edi)

Two problems here:

First, registers don't have addresses. A bare %esi is not a valid effective-address, so it's not a valid source for lea.

Second, lea's destination must be a register. There's no encoding where it takes a second effective-address to store the destination to memory.
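
To see why both rules make sense, it can help to model lea as plain arithmetic: it takes the pieces of an addressing mode and writes the computed address to a register, without touching memory. The C function below is a sketch of that for 32-bit addressing; the name lea_model and its parameter names are invented for illustration.

```c
#include <stdint.h>

/*
 * Model of what e.g. "leal 8(%edi,%esi,4), %eax" computes:
 * base = edi, index = esi, scale = 4, disp = 8. No memory is read or written;
 * the return value stands for the destination register.
 */
static uint32_t lea_model(uint32_t base, uint32_t index, uint32_t scale, uint32_t disp)
{
    return base + index * scale + disp;  /* wraps modulo 2^32 */
}
```

The inputs are the components of an effective address, which is why a bare register is not a meaningful source, and the result is a plain value with nowhere to go but a register.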


BTW, neither instruction assembles as written in the question, because you left out the , between the two operands:

valid-asm.s:2: Error: number of operands mismatch for `lea'

This answer discusses the code as it would be after fixing that syntax error.
