Why does the ARM PC register point to the instruction after the next one to be executed?


Question

According to the ARM IC:

In ARM state, the value of the PC is the address of the current instruction plus 8 bytes.

In Thumb state:

  • For the B, BL, CBNZ, and CBZ instructions, the value of the PC is the address of the current instruction plus 4 bytes.
  • For all other instructions that use labels, the value of the PC is the address of the current instruction plus 4 bytes, with bit[1] of the result cleared to make it word-aligned.

Simply saying, the value of the PC register points to the instruction after the next instruction. This is the thing I don't get. Usually (particularly on the x86) the program counter register is used to point to the address of the next instruction to be executed.
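The "+8" rule above is exactly what an assembler has to compensate for when it encodes a PC-relative branch. A minimal sketch in plain Python (function name invented for illustration) of how the 24-bit offset of an ARM-state `B` instruction is computed, measured from the address the PC will read as (instruction + 8), in units of 4-byte words:

```python
def encode_b_offset(insn_addr, target_addr):
    """Signed word offset an ARM-state B instruction encodes.

    The CPU computes the branch target as PC + (offset << 2), where PC
    reads as the branch's own address + 8, so the assembler subtracts
    that 8-byte bias when encoding.
    """
    byte_offset = target_addr - (insn_addr + 8)  # compensate for PC = insn + 8
    assert byte_offset % 4 == 0, "ARM-state targets are word-aligned"
    return byte_offset >> 2

# A branch at 0x8000 to a target at 0x8010 encodes an offset of 2 words:
# 0x8010 - (0x8000 + 8) = 8 bytes = 2 words.
print(encode_b_offset(0x8000, 0x8010))  # 2
```

Note the consequence: a branch to itself encodes -2, not 0, because the PC already sits two instructions ahead.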

So, what are the premises underlying that? Conditional execution, maybe?

Accepted answer

It's a nasty bit of legacy abstraction leakage.

The original ARM design had a 3-stage pipeline (fetch-decode-execute). To simplify the design they chose to have the PC read as the value currently on the instruction fetch address lines, rather than that of the currently executing instruction from 2 cycles ago. Since most PC-relative addresses are calculated at link time, it's easier to have the assembler/linker compensate for that 2-instruction offset than to design all the logic to 'correct' the PC register.
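The pipeline arithmetic above can be sketched as a toy model (plain Python, invented name, not a real simulator): while one instruction executes, the next is decoding and a third is being fetched, and "PC" is simply that fetch address.

```python
def pc_as_seen_by(exec_addr, insn_size=4, stages=3):
    """In an N-stage fetch/decode/execute pipeline, while the instruction
    at exec_addr is executing, the fetch stage is (stages - 1)
    instructions ahead -- and the PC reads as that fetch address."""
    return exec_addr + (stages - 1) * insn_size

# ARM state: 4-byte instructions, 3 stages -> PC = current + 8
print(hex(pc_as_seen_by(0x8000)))                 # 0x8008
# Thumb state: 2-byte instructions -> PC = current + 4
print(hex(pc_as_seen_by(0x8000, insn_size=2)))    # 0x8004
```

The same arithmetic also shows why a deeper pipeline makes the exposed PC awkward: with more stages, "the fetch address" drifts ever further from the instruction actually executing.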

Of course, that's all firmly on the "things that made sense 30 years ago" pile. Now imagine what it takes to keep a meaningful value in that register on today's 15+ stage, multiple-issue, out-of-order pipelines, and you might appreciate why it's hard to find a CPU designer these days who thinks exposing the PC as a register is a good idea.

Still, on the upside, at least it's not quite as horrible as delay slots. Instead, contrary to what you suppose, having every instruction execute conditionally was really just another optimisation around that prefetch offset. Rather than always having to take pipeline flush delays when branching around conditional code (or still executing whatever's left in the pipe like a crazy person), you can avoid very short branches entirely; the pipeline stays busy, and the decoded instructions can just execute as NOPs when the flags don't match*. Again, these days we have effective branch predictors and it ends up being more of a hindrance than a help, but for 1985 it was cool.
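To illustrate that last point, here is a toy model of predicated execution (plain Python, structure invented for illustration): every decoded instruction flows through the pipeline unconditionally, and the ones whose condition fails simply retire as NOPs, so no branch and no flush are needed.

```python
def run_predicated(instructions, flags):
    """Toy predicated-execution model: each (condition, action) pair is
    fed through the pipeline unconditionally; pairs whose condition
    fails retire as NOPs instead of forcing a branch and a flush."""
    results = []
    for cond, action in instructions:
        if cond(flags):
            results.append(action)
        else:
            results.append("nop")  # squashed, but the pipeline stayed full
    return results

# if (r0 == 0) r1 = 1; else r1 = 2;  -- with no branches at all:
program = [
    (lambda f: f["Z"], "moveq r1, #1"),      # runs only if Z flag set
    (lambda f: not f["Z"], "movne r1, #2"),  # runs only if Z flag clear
]
print(run_predicated(program, {"Z": True}))  # ['moveq r1, #1', 'nop']
```

Exactly one arm of the if/else does useful work; the other costs a cycle as a NOP, which on a short sequence is cheaper than flushing a pipeline after a mispredicted branch.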

* "...the instruction set with the most NOPs on Earth."
