AArch64: what is late-forwarding?

Problem description

"Late-forwarding" is mentioned in "Arm Neoverse E1 Core Software Optimization Guide" (as well as in their optimization guides for some other CPU models):

Instruction Group              Instructions   Exec Latency   Exec Throughput   Notes
Multiply accumulate (32-bit)   MADD, MSUB     3(2)           1                 2
Multiply accumulate (64-bit)   MADD, MSUB     5(4)           1/3               2

(2) Multiply-accumulate pipelines support late-forwarding of accumulate operands from similar μOPs, allowing a typical sequence of multiply-accumulate μOPs to issue one every N cycles (accumulate latency N shown in parentheses).

What does the term "late-forwarding" mean? What sequence of instructions would be subject to late-forwarding (counter-example would also be helpful)?

Recommended answer

Late forwarding for multiply-add operations means that the addend can be made available after the multiplication has completed, rather than having to be available when the multiply-add operation begins execution. The multiplication itself has no data dependency on the addend, so it can proceed without it. Some work for the addition can be done in parallel with the multiplication: in a floating-point multiply-add, for example, the exponent of the product is available early and can be used with the addend's exponent to determine the amount of shift needed before the addition. One may therefore want the addend before the entire product is available, but even in that case the addend is not needed until much later than the multiplicands.
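To make the distinction concrete, here is a minimal C sketch of one loop that would plausibly benefit from late forwarding and one counter-example. The function names dot_chain and horner are hypothetical, and whether a compiler actually emits MADD for these loops (rather than vectorizing or reordering them) depends on the compiler and flags. The classification in the comments is a reading of the note quoted in the question, not something measured on hardware.

```c
#include <stddef.h>
#include <stdint.h>

/* Accumulator chain: each iteration's result is used only as the ADDEND
 * of the next multiply-accumulate (acc = acc + a[i] * b[i]).
 * If this compiles to a chain of MADD instructions, the loop-carried
 * dependency goes through the accumulate operand, which is the case the
 * quoted note describes as eligible for late forwarding. */
uint32_t dot_chain(const uint32_t *a, const uint32_t *b, size_t n) {
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc += a[i] * b[i];
    }
    return acc;
}

/* Counter-example (Horner's rule): each iteration's result is used as a
 * MULTIPLICAND of the next multiply-accumulate (acc = acc * x + c[i]).
 * The multiplication cannot start until acc is known, so the dependency
 * would be expected to see the full execution latency instead. */
uint32_t horner(const uint32_t *c, size_t n, uint32_t x) {
    uint32_t acc = 0;
    for (size_t i = 0; i < n; i++) {
        acc = acc * x + c[i];
    }
    return acc;
}
```

Under the 32-bit figures in the table above, the first loop's dependency chain would advance roughly every 2 cycles and the second roughly every 3, assuming nothing else limits the loop.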

By delaying the forwarding (availability) of the addend until it is actually needed, the effective latency of a chain of dependent accumulations is reduced. This in turn reduces the number of independent accumulation registers (and the amount of parallelism) one needs to cover the latency.
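The effect on the number of accumulators can be estimated with a common rule of thumb: to keep a pipeline busy, one needs about as many independent dependency chains as the dependency latency divided by the issue interval, rounded up. The sketch below applies that rule to the figures from the table in the question. The helper accumulators_needed is hypothetical, the model ignores every other bottleneck, and it assumes that a throughput of 1/3 means one instruction every three cycles.

```c
/* Rough estimate of how many independent accumulators are needed to hide
 * a dependency latency, given how often a new instruction can issue.
 * latency_cycles:  cycles before a dependent accumulate can start
 * issue_interval:  cycles between issues (1 for throughput 1,
 *                  3 for throughput 1/3); must be nonzero
 * Returns ceil(latency_cycles / issue_interval). */
unsigned accumulators_needed(unsigned latency_cycles, unsigned issue_interval) {
    return (latency_cycles + issue_interval - 1) / issue_interval;
}
```

With the 32-bit figures (issue interval 1), the estimate drops from 3 accumulators at the full latency of 3 cycles to 2 at the accumulate latency of 2 cycles. With the 64-bit figures (issue interval 3), the estimate is 2 for both the 5-cycle and the 4-cycle latency, so in this simple model the saving shows up only in the 32-bit case.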
