x86 registers: MBR/MDR and instruction registers

Question

From what I have read, the IA-32 architecture has ten 32-bit and six 16-bit registers.

The 32-bit registers are as follows:


  • Data registers - EAX, EBX, ECX, EDX
  • Pointer registers - EIP, ESP, EBP
  • Index registers - ESI, EDI
  • Control registers - EFLAGS (EIP is also classified as a control register)

The 16-bit registers are as follows:


  • Code Segment (CS). Contains all the instructions to be executed.
  • Data Segment (DS). Contains data, constants and work areas.
  • Stack Segment (SS). Contains data and return addresses of procedures or subroutines.
  • Extra Segment (ES). Pointer to extra data.
  • F Segment (FS). Pointer to more extra data.
  • G Segment (GS). Pointer to still more extra data.

However, I can't find any information on the Current Instruction Register (CIR) or the Memory Buffer Register (MBR) / Memory Data Register (MDR). Are these registers referred to as something else? And are these registers 32-bit?

I assume they are 32-bit and that most commonly used instructions under this architecture are under 4 bytes long. From observation, many instructions seem to be under 4 bytes, for example:


  • PUSH EBP (55)
  • MOV EBP, ESP (8B EC)
  • LEA (8D 44 38 02)

For longer instructions, the CPU will use prefix codes and other optional codes. Longer instructions will require more than one cycle to complete, depending on the instruction length.

Am I correct in that the registers in question are 32-bit in length? And are there any other registers in the IA-32 architecture that I should also be aware of?

Recommended answer

No, the registers you're talking about are an implementation detail that doesn't exist as physical registers in modern x86 CPUs.

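x86 doesn't specify any of the implementation details you'd find in a toy / teaching CPU design. The x86 manuals only specify things that are architecturally visible.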

Intel and AMD's optimization manuals go into some detail about the internal implementation, and it's nothing like what you're suggesting. Modern x86 CPUs rename the architectural registers onto much larger physical register files, enabling out-of-order execution without stalling for write-after-write or write-after-read data hazards. (See "Why does mulss take only 3 cycles on Haswell, different from Agner's instruction tables?" for more details about register renaming.) See this answer for a basic intro to out-of-order exec, and a block diagram of an actual Haswell core. (And remember that a physical chip has multiple cores.)
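
To make the renaming idea concrete, here is a toy sketch in Python. It is not how any real core is built: the function name `rename`, the `p0`, `p1`, ... physical register names, and the unbounded supply of physical registers are all assumptions made for illustration (real hardware has a finite physical register file and frees entries at retirement). Each instruction is a hypothetical `(dest, src1, src2)` triple.

```python
from itertools import count

ARCH_REGS = ["eax", "ebx", "ecx", "edx"]  # illustrative subset

def rename(program, arch_regs=ARCH_REGS):
    """Map (dest, src1, src2) triples onto fresh physical registers."""
    fresh = count()
    # Initially each architectural register maps to its own physical register.
    table = {r: f"p{next(fresh)}" for r in arch_regs}
    renamed = []
    for dest, *srcs in program:
        # Sources read the current mapping, so true (RAW) dependencies remain.
        phys_srcs = [table[s] for s in srcs]
        # Every write gets a new physical register, so a later write never
        # clobbers a value that an earlier instruction still has to read.
        table[dest] = f"p{next(fresh)}"
        renamed.append((table[dest], *phys_srcs))
    return renamed

program = [
    ("eax", "ebx", "ecx"),  # eax = ebx op ecx
    ("edx", "eax", "eax"),  # edx = eax op eax  (RAW on eax: must wait)
    ("eax", "ebx", "ebx"),  # eax = ebx op ebx  (WAW / WAR on eax)
]
renamed = rename(program)
for insn in renamed:
    print(insn)
```

After renaming, the third instruction writes a different physical register than the first, and no longer conflicts with the second instruction's read, so only the true dependency between the first two remains.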

Unlike a simple or toy microarchitecture, almost all high-performance CPUs support miss-under-miss and/or hit-under-miss (multiple outstanding cache misses, rather than blocking all memory operations while waiting for the first one to complete).

You could build a simple x86 that had a single MBR / MDR; I wouldn't be surprised if original 8086 and maybe 386 microarchitectures had something like that as part of the internal implementation.

But for example a Haswell or Skylake core can do 2 loads and 1 store per cycle from/to L1d cache (see "How can cache be that fast?"). Obviously they can't have just one MBR. Instead, Haswell has 72 load-buffer entries and 42 store-buffer entries, which all together are part of the Memory Order Buffer which supports out-of-order execution of loads / stores while maintaining the illusion that only StoreLoad reordering happens / is visible to other cores.
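
As a rough illustration of why a store buffer makes StoreLoad reordering visible, here is a toy model in Python. The `Core` class and its methods are hypothetical names for this sketch; it models only a per-core FIFO of pending stores in front of shared memory, and none of the real sizing, timing, or coherence machinery.

```python
class Core:
    """Toy core: stores wait in a private FIFO before reaching shared memory."""

    def __init__(self, memory):
        self.memory = memory
        self.store_buffer = []  # pending (address, value) pairs, oldest first

    def store(self, addr, value):
        # The store is not yet visible to other cores.
        self.store_buffer.append((addr, value))

    def load(self, addr):
        # A core sees its own pending stores first (store forwarding) ...
        for a, v in reversed(self.store_buffer):
            if a == addr:
                return v
        # ... otherwise it reads shared memory.
        return self.memory.get(addr, 0)

    def drain(self):
        # Commit pending stores to shared memory in program order.
        for a, v in self.store_buffer:
            self.memory[a] = v
        self.store_buffer.clear()

memory = {}
core0, core1 = Core(memory), Core(memory)

core0.store("x", 1)
core1.store("y", 1)
r0 = core0.load("y")  # core1's store to y is still buffered
r1 = core1.load("x")  # core0's store to x is still buffered
core0.drain()
core1.drain()
print(r0, r1, memory)
```

Both loads return 0 even though each core's store comes first in its own program order, which is the StoreLoad reordering described above. Because each buffer drains in order, stores from one core never appear reordered with respect to each other in this model.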

Since P5 Pentium, naturally-aligned loads/stores up to 64 bits are guaranteed atomic, but before that only 32-bit accesses were atomic. So yes, if 386/486 had an MDR, it could have been 32 bits. But even those early CPUs could have cache between the CPU and RAM.

We know that Haswell and later have a 256-bit path between L1d cache and execution units, i.e. 32 bytes, and Skylake-AVX512 has 64-byte paths for ZMM loads/stores. AMD CPUs split wide vector ops into 128-bit chunks, so their load/store buffer entries are presumably only 16 bytes wide.

Intel CPUs at least merge adjacent stores to the same cache line within the store buffer, and there are also the 10 LFBs (line-fill buffers) for pending transfers between L1d and L2 (or off-core to L3 or DRAM).

x86 is a variable-length instruction set; after prefixes, the longest instruction is longer than 32 bits. This was true even for 8086. For example, add word [bx+disp16], imm16 is 6 bytes long. But 8088 only had a 4-byte prefetch queue to decode from (vs. 8086's 6 byte queue), so it had to support decoding instructions without having loaded the whole thing from memory. 8088 / 8086 decoded prefixes 1 cycle at a time, and 4 bytes of opcode + modRM is definitely enough to identify the length of the rest of the instruction, so it could decode it and then fetch the disp16 and/or imm16 if they weren't fetched yet. Modern x86 can have much longer instructions, especially with SSSE3 / SSE4 requiring many mandatory prefixes as part of the opcode.
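
The point that the opcode and ModRM bytes are enough to determine the rest of the length can be sketched in Python. This is a minimal illustration under stated assumptions: 16-bit addressing, no prefixes, and a hand-picked table of four opcodes (`IMM_BYTES`) rather than a full opcode map. The function names are made up for this example.

```python
# Immediate size in bytes for a few 8086 opcodes that take a ModRM byte.
# Illustrative subset only, not a full opcode map.
IMM_BYTES = {
    0x01: 0,  # add r/m16, r16
    0x80: 1,  # group 1: op r/m8, imm8
    0x81: 2,  # group 1: op r/m16, imm16
    0x83: 1,  # group 1: op r/m16, sign-extended imm8
}

def disp_bytes_16(modrm):
    """Displacement size implied by a ModRM byte in 16-bit addressing mode."""
    mod, rm = modrm >> 6, modrm & 0b111
    if mod == 0b00:
        return 2 if rm == 0b110 else 0  # rm=110 means a bare [disp16]
    if mod == 0b01:
        return 1  # disp8
    if mod == 0b10:
        return 2  # disp16
    return 0      # mod=11: register operand, no displacement

def insn_length(code):
    """Total instruction length, decided from the first two bytes alone."""
    opcode, modrm = code[0], code[1]
    return 2 + disp_bytes_16(modrm) + IMM_BYTES[opcode]

# add word [bx+0x1234], 0x5678  ->  81 87 34 12 78 56
add_mem_imm = bytes([0x81, 0x87, 0x34, 0x12, 0x78, 0x56])
print(insn_length(add_mem_imm))
```

Only `code[0]` and `code[1]` are ever inspected, so a decoder working this way can start before the displacement and immediate bytes have been fetched.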

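It's also a CISC ISA, so keeping the actual instruction bytes around internally isn't very useful; you can't use the instruction bits directly as internal control signals the way you can in a simple MIPS.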

In a non-pipelined CPU, yes, there might be a single physical EIP register somewhere. For modern CPUs, each instruction has an EIP associated with it, but many are in flight at once inside the CPU. An in-order pipelined CPU might associate an EIP with each stage, but an out-of-order CPU would have to track it on a per-instruction basis. (Actually per uop, because complex instructions decode to more than 1 internal uop.)

Modern x86 fetches and decodes in blocks of 16 or 32 bytes, decoding up to 5 or 6 instructions per clock cycle and placing the decode results in a queue for the front-end to issue into the out-of-order part of the core.

See also the CPU-internals links in https://stackoverflow.com/tags/x86/info, especially David Kanter's write-ups and Agner Fog's microarch guides.

BTW, you left out x86's many control / debug registers. CR0..4 are critical for 386 to enable protected mode, paging, and various other stuff. You could use a CPU in real mode only using the GP and segment regs, and EFLAGS, but x86 has far more architectural registers if you include the non-general-purpose regs that the OS needs to manage.
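
As a small illustration of what CR0 holds, the following Python sketch decodes three of its flag bits. The bit positions are the architectural ones (PE is bit 0, WP is bit 16, PG is bit 31), but the `decode_cr0` helper and the sample value are made up for this example; actually reading CR0 requires privileged code (`mov eax, cr0` in ring 0).

```python
# A few CR0 flag bits (subset chosen for illustration).
CR0_BITS = {
    "PE": 0,   # protection enable: switches the CPU into protected mode
    "WP": 16,  # write protect (486 and later)
    "PG": 31,  # paging enable
}

def decode_cr0(value):
    """Return which of the listed CR0 flags are set in a 32-bit value."""
    return {name: bool((value >> bit) & 1) for name, bit in CR0_BITS.items()}

# Hypothetical value: protected mode with paging on, write protect off.
flags = decode_cr0(0x80000001)
print(flags)
```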
