Is there any architecture that uses the same register space for scalar integer and floating point operations?

Question

Most architectures I've seen that have native scalar hardware FP support shove the FP values off into a completely separate register space, apart from the main set of registers.

  • x86's legacy x87 FPU uses a partially separate floating-point "stack machine" (read: basically a fixed-size 8-item ring buffer) with registers st(0) through st(7) to index each item. This is probably the most different of the popular ones. It can only interact with other registers through load/store to memory, or by sending compare results to EFLAGS (286 fnstsw ax, or i686 fcomi).
  • FPU-enabled ARM has a separate FP register space that works similarly to its integer space. The primary difference is a separate instruction set specialized for floating-point, but even the idioms mostly align.
  • MIPS is somewhere in between, in that floating point is technically done through a coprocessor (at least visibly) and it has slightly different rules surrounding usage (like doubles using two floating-point registers rather than single extended registers), but it otherwise works fairly similarly to ARM.
  • x86's newer SSE scalar instructions operate similarly to their vector instructions, using similar mnemonics and idioms. They can freely load from and store to memory, and many scalar operations accept a 64-bit memory reference as an operand, as in addsd xmm1, m64 or subsd xmm1, m64, but values move to and from the standard registers only via movq xmm1, r/m64, movq r/m64, xmm1, and friends. This is similar to ARM64 NEON, although it's slightly different from ARM's standard scalar instruction set.

Conversely, many vectorized instructions don't even bother with this distinction, and only distinguish between scalar and vector. In the case of x86, ARM, and MIPS, all three:

  • They separate the scalar and vector register spaces.
  • They reuse the same register space for vectorized integer and floating-point operations.
  • They can still access the integer stack as applicable.
  • Scalar operations simply pull their scalars from the relevant register space (or memory in the case of x86 FP constants).

But I'm wondering: are there any CPU architectures that use the same register space for integer and floating point operations?

And if not (due to reasons beyond compatibility), what would be preventing hardware designers from choosing to go that route?

Recommended answer

The Motorola 88100 had a single register file (thirty-one 32-bit entries plus a hardwired zero register) used for floating point and integer values. With 32-bit registers and support for double precision, register pairs had to be used to supply values, significantly constraining the number of double precision values that could be kept in registers.
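To make the register-pair constraint concrete, here is a minimal toy model of a unified file of thirty-two 32-bit registers with a hardwired zero in r0. It is only a sketch: the class and method names are made up for illustration, and the layout of a double (adjacent pair rN:rN+1, high word in rN) is an assumption of this model, not a statement about the 88100's exact encoding.

```python
import struct


class UnifiedRegisterFile:
    """Toy model: 32 registers of 32 bits each, with r0 hardwired to zero."""

    def __init__(self):
        self.regs = [0] * 32

    def write(self, n, value):
        if n != 0:  # writes to r0 are silently discarded
            self.regs[n] = value & 0xFFFFFFFF

    def read(self, n):
        return self.regs[n]

    def write_single(self, n, x):
        # A single-precision value fits in one 32-bit register.
        self.write(n, struct.unpack(">I", struct.pack(">f", x))[0])

    def read_single(self, n):
        return struct.unpack(">f", struct.pack(">I", self.read(n)))[0]

    def write_double(self, n, x):
        # Assumption of this sketch: a double occupies rN:rN+1, high word in rN.
        if not 1 <= n <= 30:
            raise ValueError("register pair must lie within r1..r31")
        hi, lo = struct.unpack(">II", struct.pack(">d", x))
        self.write(n, hi)
        self.write(n + 1, lo)

    def read_double(self, n):
        words = struct.pack(">II", self.read(n), self.read(n + 1))
        return struct.unpack(">d", words)[0]


rf = UnifiedRegisterFile()
rf.write(2, 42)            # an integer in r2
rf.write_double(4, 3.25)   # a double spread across r4 and r5
rf.write_single(6, 1.5)    # a single-precision float in r6

# 31 writable 32-bit registers hold at most 15 non-overlapping pairs.
max_doubles = 31 // 2
```

Every double resident in the file takes two of the same registers that integers and addresses also need, which is the capacity pressure described above.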

The follow-on 88110 added thirty-two 80-bit extended registers for additional (and larger) floating point values.

Mitch Alsup, who was involved in Motorola's 88k development, has developed his own load-store ISA (at least partially for didactic reasons) which, if I recall correctly, uses a unified register file.

It should also be noted that the Power ISA (descended from PowerPC) defines an "Embedded Floating Point Facility" which uses GPRs for floating point values. This reduces core implementation cost and context switch overhead.

One benefit of separate register files is that they provide explicit banking, which reduces register port count in a straightforward limited superscalar design. For example, providing three read ports to each file would allow any pairing of one FP operation (even a three-source-operand FMADD) with one GPR-based operation to start in parallel, as well as many common pairs of GPR-based operations, whereas a single register file would need five read ports to support an FMADD plus one other two-source operation. Another factor is that the capacity is additional and the width independent; this has both advantages and disadvantages. In addition, by coupling storage with operations, a highly distinct coprocessor can be implemented in a more straightforward manner. This was more significant for early microprocessors given chip size limits, but the UltraSPARC T1 shared a floating point unit among eight cores and AMD's Bulldozer shared an FP/SIMD unit between two integer "cores".
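The port-count argument can be checked with a little arithmetic. The sketch below is illustrative only: the operation names and source-operand counts are assumptions chosen for the example, and it counts read ports alone, ignoring write ports and store data.

```python
from collections import Counter

# Each operation is (register_class, number_of_register_source_operands).
# These entries are hypothetical examples, not taken from a specific ISA.
FMADD = ("fp", 3)   # fused multiply-add: three FP sources
FADD = ("fp", 2)    # FP add: two FP sources
ADD = ("gpr", 2)    # register + register
ADDI = ("gpr", 1)   # register + immediate
LOAD = ("gpr", 1)   # base register + immediate offset


def read_ports_split(ops):
    """Read ports needed per file when FP and GPR files are separate."""
    need = Counter()
    for reg_class, sources in ops:
        need[reg_class] += sources
    return dict(need)


def read_ports_unified(ops):
    """Read ports needed when all operands come from one register file."""
    return sum(sources for _, sources in ops)


def fits_split(ops, ports_per_file=3):
    """True if the co-issued ops fit within the per-file read-port budget."""
    return all(n <= ports_per_file for n in read_ports_split(ops).values())


split = read_ports_split([FMADD, ADD])      # {'fp': 3, 'gpr': 2}
unified = read_ports_unified([FMADD, ADD])  # 5
```

With three read ports per file, FMADD paired with ADD fits, and so does ADD paired with LOAD, while two register-register ADDs would not. Serving the same FMADD and ADD pair from one file takes five read ports on that single structure.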

A unified register file has some calling convention advantages; values can be passed in the same registers regardless of the type of the values. A unified register file also reduces unusable resources by allowing all registers to be used for all operations.
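As a rough illustration of the calling convention point, the sketch below assigns argument registers under two hypothetical conventions. The register names and the assignment rules are invented for this example and do not describe any real ABI.

```python
def assign_split(arg_types, int_regs, fp_regs):
    """Separate files: each type class draws from its own register sequence."""
    int_iter, fp_iter = iter(int_regs), iter(fp_regs)
    return [next(fp_iter) if t == "float" else next(int_iter) for t in arg_types]


def assign_unified(arg_types, regs):
    """Unified file: the Nth argument uses the Nth register, whatever its type."""
    return list(regs[:len(arg_types)])


INT_REGS = ["r2", "r3", "r4", "r5"]  # hypothetical argument registers
FP_REGS = ["f0", "f1", "f2", "f3"]   # hypothetical FP argument registers

signature = ["int", "float", "int", "float"]
split_assignment = assign_split(signature, INT_REGS, FP_REGS)
unified_assignment = assign_unified(signature, INT_REGS)
```

In the unified case the location of each argument depends only on its position, so a callee does not need to know the argument types to find them.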
