Is it possible for the RESOURCE_STALLS.RS event to occur even when the RS is not completely full?

Question

The description of the RESOURCE_STALLS.RS hardware performance event for Intel Broadwell is the following:

This event counts stall cycles caused by absence of eligible entries in the reservation station (RS). This may result from RS overflow, or from RS deallocation because of the RS array Write Port allocation scheme (each RS entry has two write ports instead of four. As a result, empty entries could not be used, although RS is not really full). This counts cycles that the pipeline backend blocked uop delivery from the front end.

This basically says that there are two situations where the RS stall event occurs:

  • When all of the eligible entries of the RS are occupied and the allocator is not stalled.
  • When "RS deallocation" occurs because there are only two write ports, and the allocator is not stalled.

What does "eligible" mean in the first situation? Does this mean that not all entries can be occupied by all kinds of uops? My understanding is that in modern microarchitectures any entry can be used by any kind of uop. Also, what is the RS array Write Port allocation scheme, and how does it cause RS stalls even when not all entries are occupied? Does this mean that there were four write ports in Haswell but there are now only two in Broadwell? Do either of these two situations apply to Skylake or Haswell even though the manual does not explicitly say so?

Recommended answer

Yes, it is possible for RESOURCE_STALLS to indicate a full RS before the RS is completely full.

As the RS becomes full, allocation of new uops into the RS becomes less ideal until at some point it may stall out entirely, even though some entries remain.

Furthermore, not all RS entries are available to all instructions. For example, on Haswell, I observe that only 30-32 of the 60 RS entries are available to loads: these entries may be special in that they support uop replay, for example. On Skylake, the situation is different: no single type of instruction can use the entire RS. Rather, the "97-entry" RS is actually made up of a 64-entry RS for ALU ops and a 33-entry RS for load ops. So all 97 entries of the RS(es) will rarely be full, unless by some coincidence both fill up at exactly the same moment.
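
To make the Skylake case concrete, here is a toy model of a split reservation station. It is only a sketch under stated assumptions: it uses the 64/33 partition sizes quoted above and treats each partition as a plain counter, which says nothing about how the real allocator works. The class `SplitRS` and its methods are hypothetical names made up for illustration.

```python
from dataclasses import dataclass

# Partition sizes quoted in the answer for Skylake.
# Everything else in this model is a simplifying assumption.
ALU_ENTRIES = 64
LOAD_ENTRIES = 33


@dataclass
class SplitRS:
    """Toy model: a reservation station split into ALU and load partitions."""

    alu_used: int = 0
    load_used: int = 0

    def try_allocate(self, kind: str) -> bool:
        """Return True if a uop of the given kind ('alu' or 'load') gets an entry."""
        if kind == "alu":
            if self.alu_used < ALU_ENTRIES:
                self.alu_used += 1
                return True
            return False
        if kind == "load":
            if self.load_used < LOAD_ENTRIES:
                self.load_used += 1
                return True
            return False
        raise ValueError(f"unknown uop kind: {kind}")

    @property
    def occupancy(self) -> int:
        """Total number of occupied entries across both partitions."""
        return self.alu_used + self.load_used


rs = SplitRS()

# Fill only the ALU partition.
for _ in range(ALU_ENTRIES):
    assert rs.try_allocate("alu")

# The next ALU uop cannot allocate even though the load partition is empty.
stalled = not rs.try_allocate("alu")
print(stalled, rs.occupancy, ALU_ENTRIES + LOAD_ENTRIES)
```

In this model an ALU uop stalls at an occupancy of 64 out of 97, which is the sense in which an "RS full" stall can be reported while the RS as a whole still has free entries.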

The RESOURCE_STALLS.RS event (umask 0x4) only triggers when the "ALU" part of the RS is full (or full enough that an op can't allocate). For the load RS (which overlaps with the ALU RS in Haswell but not Skylake), the corresponding event has umask 0x40. You can use it with perf as 'cpu/event=0xa2,umask=0x40,name=resource_stalls_memrs_full/'. Although the events are not documented for Skylake, they seem to work fine (although events with umasks 0x10 through 0x80 are very different from what is documented for Sandy Bridge).
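
The two umasks above can be turned into perf event specifications with a small helper. This is a minimal sketch: the helper names, the `resource_stalls_rs` label, and the `./a.out` workload are placeholders chosen for illustration, and the script only prints a command line rather than running perf. It relies on the Intel raw-event layout in which the event select occupies bits 0-7 of the config value and the umask occupies bits 8-15.

```python
import shlex


def raw_config(event: int, umask: int) -> int:
    """Pack event select (bits 0-7) and umask (bits 8-15) into a raw config value."""
    if not (0 <= event <= 0xFF and 0 <= umask <= 0xFF):
        raise ValueError("event and umask must each fit in 8 bits")
    return (umask << 8) | event


def perf_event_spec(event: int, umask: int, name: str) -> str:
    """Build a 'cpu/event=...,umask=...,name=.../' string as used in the answer."""
    return f"cpu/event={event:#x},umask={umask:#x},name={name}/"


RESOURCE_STALLS = 0xA2  # event code used in the answer

rs_spec = perf_event_spec(RESOURCE_STALLS, 0x04, "resource_stalls_rs")
mem_spec = perf_event_spec(RESOURCE_STALLS, 0x40, "resource_stalls_memrs_full")

# Equivalent short raw form accepted by perf, e.g. 'r04a2'.
rs_raw = f"r{raw_config(RESOURCE_STALLS, 0x04):04x}"

# './a.out' is a placeholder for the program being measured.
command = ["perf", "stat", "-e", rs_spec, "-e", mem_spec, "--", "./a.out"]
print(shlex.join(command))
print(rs_raw)
```

Counting both events in the same run makes it easier to see whether stalls come from the ALU side or the load side of the RS. Keep in mind that the answer describes umask 0x40 as undocumented on Skylake, so the numbers should be treated as an observation rather than a guaranteed interface.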

Future Intel chips are likely to have even finer-grained reservation stations.
