What is the point of delay slots?


Question

From my understanding of delay slots, they come into play when a branch instruction is executed: the instruction that follows the branch also gets loaded from memory. What is the point of this? Wouldn't you expect the code after a branch not to run if the branch is taken? Is it to save time in case the branch isn't taken?

I am looking at a pipeline diagram, and it seems the instruction after the branch is getting carried out anyway.

Recommended answer

Most processors these days use pipelines. The ideas and problems from the H&P (Hennessy and Patterson) books are used everywhere. At the time of those original writings, I would assume the actual hardware matched that particular notion of a pipeline: fetch, decode, execute, write back.

Basically a pipeline is an assembly line with four main stages, so you have at most four instructions being worked on at once. That confuses the notion of how many clocks it takes to execute an instruction. A single instruction takes more than one clock, but if you have several executing in parallel, the "average" can approach or exceed one instruction per clock.
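To put rough numbers on that "average", here is a minimal sketch assuming an ideal four-stage pipeline with no stalls and no branches. The function names `cycles_for` and `average_cpi` are made up for this illustration. It only shows the average approaching one clock per instruction; exceeding one per clock needs multiple instructions issued per clock, which this sketch does not model.

```python
def cycles_for(n_instructions: int, stages: int = 4) -> int:
    """Clocks needed to finish n instructions in an ideal pipeline (no stalls)."""
    if n_instructions == 0:
        return 0
    # The first instruction needs `stages` clocks to pass through every stage;
    # each later instruction finishes one clock after the one before it.
    return stages + (n_instructions - 1)


def average_cpi(n_instructions: int, stages: int = 4) -> float:
    """Average clocks per instruction over the whole run."""
    return cycles_for(n_instructions, stages) / n_instructions


if __name__ == "__main__":
    for n in (1, 4, 100, 10_000):
        print(n, cycles_for(n), round(average_cpi(n), 3))
```

Under these assumptions a lone instruction costs 4 clocks, but 100 back-to-back instructions cost 103 clocks, which is about 1.03 clocks each.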

When you take a branch, though, the assembly line breaks down. The instructions in the fetch and decode stages have to be tossed and you have to start filling again, so you take a hit of a few clocks to fetch, decode, and then get back to executing. The idea of the branch shadow, or delay slot, is to recover one of those clocks. If you declare that the instruction after a branch is always executed, then when a branch is taken the instruction in the decode slot also gets executed, the instruction in the fetch slot is discarded, and you have one hole of time instead of two. So instead of execute, empty, empty, execute, execute you now have execute, execute, empty, execute, execute... in the execute stage of the pipeline. The branch is 50% less painful, your overall average execution speed improves, etc.
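The two execute-stage sequences can be reproduced with a toy model. This is only a sketch, assuming the four-stage pipeline above, a single taken forward branch that resolves in the execute stage, and a made-up six-instruction program; `execute_stage_trace` and its parameters are hypothetical names, not part of any real simulator.

```python
def execute_stage_trace(program, branch_index, target_index, delay_slot):
    """Return what occupies the execute stage on each clock for one taken branch.

    Assumption: the branch resolves in execute, so two younger instructions
    (one in decode, one in fetch) are already in the pipe at that moment.
    """
    trace = list(program[: branch_index + 1])   # everything up to and including the branch
    in_decode = program[branch_index + 1]       # the instruction right after the branch
    if delay_slot:
        trace.append(in_decode)                 # delay slot: always executed by definition
        trace.append("empty")                   # only the instruction in fetch is discarded
    else:
        trace.extend(["empty", "empty"])        # both younger instructions are discarded
    trace.extend(program[target_index:])        # execution resumes at the branch target
    return trace


if __name__ == "__main__":
    # Hypothetical program: "beq" at index 1 is taken and jumps to "xor" at index 4.
    program = ["add", "beq", "sub", "or", "xor", "and"]
    print(execute_stage_trace(program, 1, 4, delay_slot=False))
    print(execute_stage_trace(program, 1, 4, delay_slot=True))
```

Without a delay slot the trace has two empty clocks after `beq`; with one, `sub` fills the first of them and only a single empty clock remains.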

ARM does not have a delay slot, but it also exposes the illusion of a pipeline by declaring that the program counter is two instructions ahead. Any operation that relies on the program counter (pc-relative addressing) must compute the offset using a pc that is two instructions ahead: for ARM instructions this is 8 bytes, for original Thumb 4 bytes, and when you add in Thumb-2 instructions it gets messy.
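As a rough illustration of that offset arithmetic, here is a sketch that only models the "pc reads as current instruction + 8 (ARM) or + 4 (Thumb)" rule. The helper names are invented for this example, and it deliberately ignores encoding details such as the word alignment of the pc that some Thumb pc-relative loads apply.

```python
def pc_as_read(instruction_address: int, thumb: bool = False) -> int:
    """Value an instruction sees when it reads the pc.

    ARM state: two 4-byte instructions ahead (+8).
    Thumb state: two 2-byte instructions ahead (+4).
    """
    return instruction_address + (4 if thumb else 8)


def pc_relative_offset(instruction_address: int, target_address: int,
                       thumb: bool = False) -> int:
    """Byte offset to encode so that pc + offset lands on the target."""
    return target_address - pc_as_read(instruction_address, thumb)


if __name__ == "__main__":
    # Addresses are arbitrary examples.
    print(pc_relative_offset(0x8000, 0x8010))              # forward branch, ARM state
    print(pc_relative_offset(0x8000, 0x8000))              # branch to self, ARM state
    print(pc_relative_offset(0x8000, 0x8000, thumb=True))  # branch to self, Thumb state
```

The point is that a branch to itself does not encode an offset of zero: the offset has to cancel out the two instructions the pc is defined to be ahead.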

Outside of academia these are illusions at this point. Real pipelines are deeper, have lots of tricks, etc. So that legacy code keeps working, and so that nobody has to redefine how instructions work with each architecture change (imagine MIPS rev x with 1 delay slot, rev y with 2 delay slots, rev z with 3 slots if condition a, 2 slots if condition b, and 1 slot if condition c), the processor goes ahead and executes the first instruction after a branch, then discards the other handful or dozen behind it as it refills the pipe. How deep the pipes really are is often not shared with the public.

I saw a comment about this being a RISC thing. It may have started there, but CISC processors use the exact same tricks while giving the illusion of the legacy instruction set. At times the CISC processor is no more than a RISC or VLIW core with a wrapper (microcode) that emulates the legacy CISC instruction set.

Watch the show How It's Made. Visualize an assembly line where each step in the line has a task. What if one step in the line ran out of blue whatsits, and to make the blue and yellow product you need the blue whatsits? And you can't get new blue whatsits for another week because someone screwed up. So you have to stop the line, change the supplies at each stage, and make the red and green product for a while, a switch that normally could have been phased in properly without dumping the line. That is what happens with a branch: somewhere deep in the assembly line, something forces the line to change, and you dump the line. The delay slot is a way to recover one product that would otherwise have been discarded. Instead of N products coming out before the line stopped, N+1 products come out per production run. Execution of code is like bursts of production runs: you often get short, sometimes long, linear execution paths before hitting a branch to another short execution path, then a branch to another short execution path...
