What is a microcoded instruction?


Question

I have seen a lot of literature referencing microcoded instructions.

What are these, and why are they used?

Recommended answer

A CPU reads machine code and decodes it into internal control signals that send the right data to the right execution units.

Most instructions map to one internal operation, and can be decoded directly. (e.g. on x86, add eax, edx just sends eax and edx to the integer ALU for an ADD operation, and puts the result in eax.)

Some other single instructions do much more work. e.g. x86's rep movs implements memcpy(edi, esi, ecx), and requires the CPU to loop.
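
To make the "requires the CPU to loop" point concrete, here is a minimal Python sketch of the architectural effect of rep movsb (the byte-sized form of rep movs) with the direction flag clear. The function name, the bytearray standing in for memory, and the sample addresses are assumptions for illustration. Real hardware is free to copy in wider chunks internally; this only models the visible result.

```python
def rep_movsb(mem: bytearray, edi: int, esi: int, ecx: int) -> tuple:
    """Toy model of x86 `rep movsb` with DF=0: copy ecx bytes from [esi] to [edi]."""
    while ecx != 0:
        mem[edi] = mem[esi]  # one byte moved per iteration
        esi += 1             # both pointers advance because DF=0
        edi += 1
        ecx -= 1             # the count register runs down to zero
    return edi, esi, ecx     # final register values, as the instruction leaves them


# Hypothetical 16-byte memory image: copy the first 5 bytes to offset 11.
mem = bytearray(b"hello world.....")
edi, esi, ecx = rep_movsb(mem, edi=11, esi=0, ecx=5)
print(mem.decode())
print(edi, esi, ecx)
```

One instruction hides a whole loop, which is why a decoder cannot turn it into a short, fixed set of internal operations.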

When the instruction decoders see an instruction like that, instead of just producing internal control signals directly, they read micro-code out of the microcode ROM.

A micro-coded instruction is one that decodes to many internal operations.

Modern x86 CPUs always decode x86 instructions to internal micro-operations (uops). In this terminology, it still doesn't count as "micro-coded" even when add [mem], eax decodes to a load from [mem], an ALU ADD operation, and a store back into [mem]. Another example is xchg eax, edx, which decodes to 3 uops on Intel Haswell. Interestingly, these are not exactly the same kind of uops you'd get from using 3 MOV instructions to do the exchange with a scratch register, because they aren't zero-latency.
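
As a rough illustration of that load / ADD / store breakdown, the sketch below decodes add [mem], eax into three toy uops and runs them. The tuple encoding, the internal "tmp" register, and the function names are invented for this example. Real uop formats are not public, and details such as micro-fusion or split store-address / store-data uops are ignored here.

```python
def decode_add_mem_reg(addr: int, src_reg: str) -> list:
    """Toy decode of `add [addr], src_reg` into three uops (hypothetical encoding)."""
    return [
        ("load", "tmp", addr),    # tmp <- mem[addr]
        ("add", "tmp", src_reg),  # tmp <- tmp + src_reg (32-bit wraparound)
        ("store", addr, "tmp"),   # mem[addr] <- tmp
    ]


def run_uops(uops: list, regs: dict, mem: dict) -> None:
    """Execute the toy uops in order against a register file and a memory map."""
    for uop in uops:
        kind = uop[0]
        if kind == "load":
            _, dst, addr = uop
            regs[dst] = mem[addr]
        elif kind == "add":
            _, dst, src = uop
            regs[dst] = (regs[dst] + regs[src]) & 0xFFFFFFFF
        elif kind == "store":
            _, addr, src = uop
            mem[addr] = regs[src]
        else:
            raise ValueError(f"unknown uop kind: {kind}")


regs = {"eax": 5}
mem = {0x1000: 37}
uops = decode_add_mem_reg(0x1000, "eax")
run_uops(uops, regs, mem)
print(len(uops), mem[0x1000])
```

The decoders can emit a short, fixed sequence like this directly, so no microcode ROM is involved.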

On Intel / AMD CPUs, "micro-coded" means the decoders turn on the micro-code sequencer to feed uops from the ROM into the pipeline, instead of producing multiple uops directly.

(You could call any multi-uop x86 instruction "microcoded" if you were thinking in pure RISC terms, but it's useful to use the term "microcoded" to make a different distinction, IMO. I think this meaning is widespread in x86 optimization circles, like Intel's optimization manual. Other people may use different meanings for the terminology, especially when talking about other architectures, or about computer architecture in general when comparing x86 to a RISC.)

In current Intel CPUs, the limit on what the decoders can produce directly, without going to the micro-code ROM, is 4 uops (fused-domain). AMD similarly has FastPath (aka DirectPath) single or double instructions (1 or 2 "macro-ops", AMD's equivalent of uops), and beyond that it's VectorPath aka Microcode, as explained in David Kanter's in-depth look at AMD Bulldozer, specifically talking about its decoders.
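
The Intel rule above boils down to a threshold on the fused-domain uop count. A minimal sketch follows, assuming a hypothetical helper decode_path() and using the 4-uop limit quoted in the text. The uop count passed for the microcoded case is a placeholder, not a measured value.

```python
DIRECT_DECODE_LIMIT = 4  # fused-domain uops, the Intel limit quoted in the text


def decode_path(uop_count: int, limit: int = DIRECT_DECODE_LIMIT) -> str:
    """Return which front-end path a toy decoder would use for an instruction."""
    if uop_count < 1:
        raise ValueError("an instruction decodes to at least one uop")
    if uop_count <= limit:
        return "decoders"            # uops produced directly by the decoders
    return "microcode sequencer"     # uops fetched from the microcode ROM


print(decode_path(1))   # e.g. add eax, edx
print(decode_path(3))   # e.g. xchg eax, edx on Haswell, per the text
print(decode_path(10))  # placeholder count for some instruction over the limit
```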

Another example is x86's integer DIV instruction, which is micro-coded even on modern Intel CPUs like Haswell. But not on AMD; AMD just has 1 or 2 uops activate everything inside the integer divider unit. It's not fundamental to DIV, just an implementation choice. See my answer on "C++ code for testing the Collatz conjecture faster than hand-written assembly - why?" for the numbers.

FP division is also slow, but is decoded to a single uop so it doesn't bottleneck the front-end. If FP division is rare and not part of a latency bottleneck, it can be as cheap as multiplication. (But if execution does have to wait for its result, or bottlenecks on its throughput, it's much slower.) More in this answer.

Integer division and other micro-coded instructions can give the CPU a hard time, and create effects that make code alignment matter where it wouldn't otherwise.

To learn more about x86 CPU internals, see the x86 tag wiki, and especially Agner Fog's microarch guide.

David Kanter's deep dives into x86 microarchitectures are also useful for understanding the pipeline that uops go through: Core 2 and Sandy Bridge are the major ones, and the AMD K8 and Bulldozer articles are interesting for comparison.

RISC vs. CISC Still Matters (Feb 2000) by Paul DeMone looks at how PPro breaks down instructions into uops, vs. RISCs where most instructions are already simple enough to go through the pipeline in one step, with only rare ones like ARM push/pop of multiple registers needing to send multiple things down the pipeline (aka microcoded in RISC terms).

And for good measure, Modern Microprocessors: A 90-Minute Guide! is always worth recommending for the basics of pipelining and OoO exec.

In some older / simpler CPUs, every instruction was effectively micro-coded. For example, the 6502 executed 6502 instructions by running a sequence of internal instructions from a PLA decode ROM. This works well for a non-pipelined CPU, where the order of using the different parts of the CPU can vary from instruction to instruction.
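
The "every instruction runs a sequence of internal steps from a ROM" idea can be sketched with a toy machine. Everything below is hypothetical: the mnemonics, the MICRO_ROM contents, and the step names are made up and are not the real 6502 instruction set or PLA contents. The point is only that the opcode selects a stored sequence, and the sequencer walks through it.

```python
# Hypothetical micro-ROM: opcode -> ordered internal steps (not real 6502 data).
MICRO_ROM = {
    "LOAD_A": ["fetch_operand", "operand_to_a"],
    "ADD_A": ["fetch_operand", "alu_add_a_operand"],
    "INC_X": ["increment_x"],
}


def run(program: list) -> dict:
    """Run a toy non-pipelined CPU that executes every instruction from MICRO_ROM."""
    cpu = {"a": 0, "x": 0, "pc": 0, "operand": 0, "micro_steps": 0}
    while cpu["pc"] < len(program):
        opcode = program[cpu["pc"]]
        cpu["pc"] += 1
        for micro_step in MICRO_ROM[opcode]:  # the sequencer walks the ROM entry
            if micro_step == "fetch_operand":
                cpu["operand"] = program[cpu["pc"]]
                cpu["pc"] += 1
            elif micro_step == "operand_to_a":
                cpu["a"] = cpu["operand"]
            elif micro_step == "alu_add_a_operand":
                cpu["a"] = (cpu["a"] + cpu["operand"]) & 0xFF  # 8-bit wraparound
            elif micro_step == "increment_x":
                cpu["x"] = (cpu["x"] + 1) & 0xFF
            cpu["micro_steps"] += 1
    return cpu


cpu = run(["LOAD_A", 40, "ADD_A", 2, "INC_X"])
print(cpu["a"], cpu["x"], cpu["micro_steps"])
```

Different instructions use the register file and ALU in different orders and for different numbers of steps, which is easy when nothing is pipelined.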

Historically, there was a different technical meaning for "microcode", meaning something like the internal control signals decoded from the instruction word. Especially in a CPU like MIPS, where the instruction word mapped directly to those control signals without complicated decoding. (I may have this partly wrong; I read something like this (other than in the deleted answer on this question) but couldn't find it again later.)

This meaning may still actually get used in some circles, like when designing a simple pipelined CPU, like a hobby MIPS.
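
To show what "mapped directly" can look like, here is a small sketch that slices a MIPS R-type instruction word into its fixed bit fields, roughly what a simple hobby decoder would wire to its register file and ALU control. The function name and dictionary layout are assumptions for illustration. The field positions are the standard MIPS32 R-type layout.

```python
def decode_r_type(word: int) -> dict:
    """Split a 32-bit MIPS R-type instruction word into its fixed fields."""
    return {
        "opcode": (word >> 26) & 0x3F,  # bits 31..26
        "rs": (word >> 21) & 0x1F,      # bits 25..21, first source register
        "rt": (word >> 16) & 0x1F,      # bits 20..16, second source register
        "rd": (word >> 11) & 0x1F,      # bits 15..11, destination register
        "shamt": (word >> 6) & 0x1F,    # bits 10..6, shift amount
        "funct": word & 0x3F,           # bits 5..0, selects the ALU operation
    }


# 0x012A4020 encodes `add $t0, $t1, $t2` ($t0=8, $t1=9, $t2=10).
fields = decode_r_type(0x012A4020)
print(fields)
```

No lookup or sequencing is needed: each field is just a fixed slice of the word, in contrast to the ROM-driven schemes described earlier.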
