Can a "PUSH" instruction's operation be performed using other instructions?

Question

It currently seems to me that the only reason we have instructions like "Push" is to replace multiple MOV and arithmetic instructions with a single instruction.

Is there anything "PUSH" does that cannot be accomplished by more primitive instructions?

Is "PUSH" just a single mnemonic that compiles into multiple machine code instructions?

Answer

PUSH is a real machine instruction (https://www.felixcloutier.com/x86/push), not just an assembler macro or pseudo-instruction. For example, push rax has a single-byte encoding: 0x50.

But yes, you can emulate it with other instructions, such as sub rsp, 8 followed by a mov store. (Having one instruction do the work of several simpler ones is normal for a CISC machine like x86!) See, for example, "What is the function of the push / pop instructions used on registers in x86 assembly?"

To emulate it exactly (without modifying flags), use LEA instead of ADD/SUB, because LEA does not write FLAGS.

  lea   rsp, [rsp-8]
  mov   qword [rsp], 123      ; push 123 in 64-bit mode
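
To make the flags difference concrete, here is a minimal Python sketch of the three sequences. It is a toy model, not an emulator: the Cpu class, its single cf field (standing in for all of FLAGS), the starting RSP value, and the function names are all assumptions made up for illustration.

```python
from dataclasses import dataclass, field

MASK64 = (1 << 64) - 1


@dataclass
class Cpu:
    """Hypothetical toy state: RSP, one carry flag, and a map of qword stores."""
    rsp: int = 0x7FFF_FFFF_F000   # arbitrary starting stack pointer
    cf: bool = False              # stands in for all of FLAGS
    mem: dict = field(default_factory=dict)  # address -> 64-bit value


def push(cpu, value):
    """Model of `push value`: RSP -= 8, store at [RSP], FLAGS untouched."""
    cpu.rsp = (cpu.rsp - 8) & MASK64
    cpu.mem[cpu.rsp] = value & MASK64


def sub_then_mov(cpu, value):
    """Model of `sub rsp, 8` + `mov [rsp], value`: same store, but SUB writes FLAGS."""
    cpu.cf = cpu.rsp < 8          # SUB sets CF on unsigned borrow
    cpu.rsp = (cpu.rsp - 8) & MASK64
    cpu.mem[cpu.rsp] = value & MASK64


def lea_then_mov(cpu, value):
    """Model of `lea rsp, [rsp-8]` + `mov [rsp], value`: LEA leaves FLAGS alone."""
    cpu.rsp = (cpu.rsp - 8) & MASK64
    cpu.mem[cpu.rsp] = value & MASK64


if __name__ == "__main__":
    for emulate in (push, sub_then_mov, lea_then_mov):
        cpu = Cpu(cf=True)        # pretend an earlier instruction left CF set
        emulate(cpu, 123)
        print(f"{emulate.__name__:13} rsp={cpu.rsp:#x} "
              f"top={cpu.mem[cpu.rsp]} cf={cpu.cf}")
```

In this model all three leave the same stack pointer and the same stored value; only the sub-based version overwrites the carry flag that was set beforehand.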

> Is there anything "PUSH" does that cannot be accomplished by more primitive instructions?

Nothing significant beyond efficiency and code-size.

A single instruction is atomic with respect to interrupts: it either happens completely or not at all, whereas a multi-instruction emulation can be interrupted between its instructions. This is normally totally irrelevant; asynchronous interrupt handlers don't usually look at the stack or register contents of the code that got interrupted.

PUSH gets the job done in a single byte of machine code when pushing a register, or 2 bytes for a small immediate. A multi-instruction sequence is much larger. The architect of 8086's ISA was very focused on making small code size possible, so yes, it's totally normal to have one short instruction that replaces a couple of longer ones. For example, we have not instead of having to use xor reg, -1, and inc instead of add reg, 1. (Although, again, both have different FLAGS semantics than their replacements: NOT leaves flags untouched, and INC/DEC leave CF untouched.) Not to mention all of x86's other special-case encodings, like the 1-byte encodings for xchg-with-[e/r]ax. See https://codegolf.stackexchange.com/questions/132981/tips-for-golfing-in-x86-x64-machine-code
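
As a rough illustration of the size difference, the Python sketch below adds up byte counts for the x86-64 sequences discussed above. The hex strings are hand-assembled from the instruction formats, so treat them as an assumption and check them against an assembler listing; ENCODINGS and size are hypothetical names used only here.

```python
# Hand-assembled x86-64 encodings (hex). Assumption: verify with an assembler.
ENCODINGS = {
    "push rax":             "50",
    "sub rsp, 8":           "48 83 EC 08",
    "lea rsp, [rsp-8]":     "48 8D 64 24 F8",
    "mov [rsp], rax":       "48 89 04 24",
    "push 123":             "6A 7B",
    "mov qword [rsp], 123": "48 C7 04 24 7B 00 00 00",
}


def size(*instructions):
    """Total machine-code bytes for a sequence of instructions in ENCODINGS."""
    return sum(len(bytes.fromhex(ENCODINGS[i])) for i in instructions)


if __name__ == "__main__":
    sequences = [
        ("push rax",),
        ("sub rsp, 8", "mov [rsp], rax"),
        ("lea rsp, [rsp-8]", "mov [rsp], rax"),
        ("push 123",),
        ("lea rsp, [rsp-8]", "mov qword [rsp], 123"),
    ]
    for seq in sequences:
        print(f"{' ; '.join(seq):40} {size(*seq)} byte(s)")
```

With these encodings the push forms come to 1 and 2 bytes, while the emulated sequences come to between 8 and 13 bytes.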

Also efficiency: PUSH decodes to a single uop (in the fused domain) on Pentium-M and later CPUs, thanks to the stack engine, which handles implicit uses of the stack pointer by instructions like push/pop and call/ret. Two separate instructions of course decode to at least 2 uops. (The one exception to that rule is macro-fusion of test/cmp + JCC.)

On the ancient P5 Pentium, emulating push with separate ALU and mov instructions was actually a win. Before PPro, CPUs didn't know how to break down complex CISC instructions into separate uops, and complex instructions couldn't pair in P5's dual-issue in-order pipeline. (See Agner Fog's microarch guide.) The main benefits were being able to mix in other instructions that could pair, and doing one big sub followed by plain mov stores instead of changing the stack pointer multiple times.

This also applies to early P6-family CPUs, before the stack engine. For example, GCC with -march=pentium3 will tend to avoid push and just make one bigger adjustment to ESP.
