Example with MIPS, Pipelining and Branch Delay Slot

Question

I am preparing for a test and have the following example. This code:

1: SLL $1, $1, 2
2: LW $2, 1000($1)
3: BEQL $2, $0, END
4: ADDI $3, $2, 1
5: MULT $3, $2
6: MFLO $4
END:
7: J QUIT
...
QUIT:
100: NOP

is executed on a RISC processor (with a quasi-MIPS instruction set) with

  • a five-stage pipeline
  • no bypassing
  • no dynamic scheduling
  • a branch delay slot
  • Additionally, we know that the branch won't be taken

My task is to understand how the branch delay slot works in this situation and to build the correct pipeline diagram.

I have an official solution, which gives the following diagram with no explanation:

1: SLL $1, $1, 2         IDEMW  
2: LW $2, 1000($1)        I---DEMW  
3: BEQL $2, $0, END           I---DEMW  
4: ADDI $3, $2, 1                 IDx
5: MULT $3, $2                       IDEMW
6: MFLO $4                            I---DEMW

As far as I understand, ADDI is executed in the branch delay slot and is stopped once the processor determines that the branch is not taken, which leads to a wrong result. My questions are:

  • Am I right?
  • If so, why is ADDI executed in the branch delay slot when no jump takes place?

Recommended answer

The CPU keeps fetching instructions sequentially. While BEQL is executing (it has already been fetched and decoded, and its remaining phases are being processed; I don't know your exact phases, so this is only a general description), the front of the pipeline is free to fetch the next instruction. The branch has not finished yet, so the PC still points to the instruction right after the branch. That position is the "branch delay slot".

On classic MIPS this next instruction is fetched, decoded, and executed while the branch may or may not change the PC to the branch target, so the delay slot instruction is executed every time. The instruction after it is executed only when the branch is not taken, i.e. when the PC continues sequentially past the delay slot. If the branch does modify the PC, fetch and decode pick up the next instruction from the new destination, so on classic MIPS the branch delay slot is only one instruction "big". (I have no idea whether more complex MIPS CPUs have more stages and more delay slots. Technically, with a 5-stage pipeline even 5 delayed instructions sound possible in hardware, but that would probably be very difficult to use in practice and sounds like it would create more problems than it solves.)
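The classic behaviour described above can be sketched in a few lines of Python. This is only an illustration of the execution order, not a model of any particular CPU; the function `executed_order`, the instruction names, and the program layout are made up for the example.

```python
def executed_order(program, branch_taken):
    """Return the names of the instructions that execute, in order.

    program: list of names; a branch is written as (name, target_index).
    A taken branch changes the PC only after the delay slot has executed.
    """
    trace = []
    pc = 0
    pending_target = None              # set by a taken branch
    while pc < len(program):
        instr = program[pc]
        next_pc = pc + 1
        if pending_target is not None:
            # This instruction sits in the delay slot of a taken branch.
            next_pc = pending_target
            pending_target = None
        is_branch = isinstance(instr, tuple)
        trace.append(instr[0] if is_branch else instr)
        if is_branch and branch_taken:
            pending_target = instr[1]
        pc = next_pc
    return trace

program = ["before", ("beq", 4), "delay_slot", "fall_through", "target"]
print(executed_order(program, branch_taken=True))
print(executed_order(program, branch_taken=False))
```

In both runs `delay_slot` appears in the trace; only `fall_through` depends on the branch outcome.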

BEQL is a more complex instruction: it kills the delay slot instruction partway through execution if the branch condition fails.

See http://math-atlas.sourceforge.net/devel/assembly/mips-iv.pdf, page 45, for a detailed description of BEQL.

So "NullifyCurrentInstruction()" is probably the "x" in the diagram. For the rest of the diagram I'm only guessing, as I didn't study the details of your 5 stages: the second instruction, LW, finds out after being fetched (and decoded?) that it depends on $1, so it waits until the previous instruction has passed its W phase, and so on. ADDI doesn't have to wait for anything, so it runs almost in parallel with BEQL and gets killed toward the end.
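The stall pattern of the first three rows can be reproduced with a small Python sketch. It rests on two assumptions that are my reading of the diagram, not documented properties of this CPU: an instruction may enter D only in the cycle after the producer of its source register has finished W, and each instruction is fetched in the cycle the previous one enters D. The name `schedule` and the tuple layout are hypothetical.

```python
def schedule(instrs):
    """Return (name, fetch_cycle, stage_string) for each instruction.

    instrs: list of (name, dest_reg_or_None, [source_regs]).
    Assumed model: in-order, no bypassing, one instruction per stage.
    """
    ready = {}    # register -> first cycle in which a reader may be in D
    fetch = 1     # the first instruction is fetched in cycle 1
    rows = []
    for name, dest, srcs in instrs:
        earliest_decode = fetch + 1
        decode = max([earliest_decode] + [ready.get(s, 0) for s in srcs])
        stalls = decode - earliest_decode
        rows.append((name, fetch, "I" + "-" * stalls + "DEMW"))
        if dest is not None:
            # D, E, M, W occupy cycles decode .. decode + 3
            ready[dest] = decode + 4
        fetch = decode   # the next fetch happens as this one enters D
    return rows

rows = schedule([
    ("SLL",  "$1", ["$1"]),
    ("LW",   "$2", ["$1"]),
    ("BEQL", None, ["$2", "$0"]),
])
for name, fetch, stages in rows:
    print(f"{name:5}" + " " * (fetch - 1) + stages)
```

The sketch stops at BEQL on purpose: it does not model the nullified ADDI or the point at which MULT is fetched.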

But I don't understand why a new "I" phase doesn't start every time the "I" stage is freed. It looks like the fetch waits for something, and in the end at most 2 instructions are in flight at the same time.

Anyway, this is quite undecipherable without studying the technical details of the CPU used in your question, and I don't want to study it. I'm not even sure what kind of CPU you have, or where to get its technical documentation.

Edit: I will also try to extract the relevant part of the PDF here, so that this answer is not "just a link", but copying from a PDF may be tricky...

BEQL instruction documentation for the MIPS IV CPU:

Description: if (rs = rt) then branch_likely
An 18-bit signed offset (the 16-bit offset field shifted left 2 bits) is added to the address of the instruction following the branch (not the branch itself), in the branch delay slot, to form a PC-relative effective target address.
If the contents of GPR rs and GPR rt are equal, branch to the target address after the instruction in the delay slot is executed. If the branch is not taken, the instruction in the delay slot is not executed.
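To see what this description means for the code in the question, here is a minimal Python sketch of instructions 3 to 6 at the register level. It follows the quoted semantics only; the function name `run_fragment` and the initial register values are made up for illustration, and it ignores timing entirely.

```python
def run_fragment(r2, r3_old, r4_old=0):
    """Model BEQL $2,$0,END / ADDI $3,$2,1 / MULT $3,$2 / MFLO $4.

    Returns (branch_taken, regs), where regs maps register number to value.
    """
    regs = {2: r2, 3: r3_old, 4: r4_old}
    taken = regs[2] == 0                # BEQL $2, $0, END
    if taken:
        regs[3] = regs[2] + 1           # ADDI in the delay slot executes
        return taken, regs              # control continues at END
    # Branch not taken: ADDI is nullified, so $3 keeps its old value.
    lo = (regs[3] * regs[2]) & 0xFFFFFFFF   # MULT, low word (non-negative inputs assumed)
    regs[4] = lo                        # MFLO $4
    return taken, regs

print(run_fragment(r2=5, r3_old=7))
print(run_fragment(r2=0, r3_old=7))
```

With the branch not taken, MULT multiplies by the old $3, because the ADDI result is never written. Whether that is the "wrong result" depends on what the code was meant to compute.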

Operation:
I:
  tgt_offset ← sign_extend(offset || 0^2)
  condition ← (GPR[rs] = GPR[rt])
I+1:
  if condition then
    PC ← PC + tgt_offset
  else
    NullifyCurrentInstruction()
  endif
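The target address computation in the operation above can be written out as a short Python sketch. The helper names `sign_extend` and `branch_target`, the 32-bit address mask, and the example addresses are assumptions for illustration; "offset || 0^2" means the 16-bit offset with two zero bits appended, i.e. shifted left by 2.

```python
def sign_extend(value, bits):
    """Interpret the low `bits` bits of value as a two's-complement number."""
    sign_bit = 1 << (bits - 1)
    return (value & (sign_bit - 1)) - (value & sign_bit)

def branch_target(branch_addr, offset16):
    """Target of a branch at branch_addr with a 16-bit offset field."""
    delay_slot_addr = branch_addr + 4          # instruction following the branch
    tgt_offset = sign_extend((offset16 & 0xFFFF) << 2, 18)
    return (delay_slot_addr + tgt_offset) & 0xFFFFFFFF   # assumes 32-bit addresses

# Hypothetical example: a branch at 0x1000 skipping 3 instructions past its delay slot.
print(hex(branch_target(0x1000, 3)))
# An offset field of 0xFFFF encodes -1, which points back at the branch itself.
print(hex(branch_target(0x1000, 0xFFFF)))
```

In the question's code, END is three instructions after the delay slot (ADDI), so an assembler would presumably encode an offset of 3 in the BEQL.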
