Does a cache empty itself if idle for a long time?

Question

Does cache memory flush itself if it doesn't encounter any instruction for a threshold amount of time?

What I mean is this: suppose I have a multi-core machine with isolated cores, and one of those cores sees no activity for, say, a few seconds. In this case, will the last instructions in the instruction cache be flushed after a certain amount of time has passed?

I understand this can be architecture-dependent, but I am looking for general pointers on the concept.

Recommended answer

If a cache is power-gated in a particular idle state and if it's implemented using a volatile memory technology (such as SRAM), the cache will lose its contents. In this case, to maintain the architectural state, all dirty lines must be written to some memory structure that will retain its state (such as the next level of the memory hierarchy). Most processors support power-gating idle states. For example, on Intel processors, in the core C6 and deeper states, the core is fully power-gated including all private caches. When the core wakes up from any of these states, the caches will be cold.
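
On Linux, the idle states a core can enter are visible through the cpuidle sysfs interface. The following is a minimal sketch, assuming a Linux system with cpuidle enabled; the helper name `read_idle_states` is made up for illustration. It lists each state's name, exit latency, and entry count. It does not show whether a given state flushes or power-gates the caches; that still has to be looked up for the specific processor.

```python
from pathlib import Path


def _read(path):
    """Return the stripped contents of a sysfs file, or None if it is absent."""
    return path.read_text().strip() if path.is_file() else None


def read_idle_states(cpu=0, sysfs_root="/sys/devices/system/cpu"):
    """List the idle states that Linux cpuidle exposes for one CPU.

    Hypothetical helper: it only reads the per-state files 'name', 'desc',
    'latency', 'usage' and 'time' under cpuN/cpuidle/stateM.
    """
    base = Path(sysfs_root) / f"cpu{cpu}" / "cpuidle"
    if not base.is_dir():
        return []  # cpuidle not available (or not Linux)

    # Keep only directories of the form "state<digits>" and sort numerically.
    dirs = [p for p in base.glob("state*") if p.name[len("state"):].isdigit()]
    dirs.sort(key=lambda p: int(p.name[len("state"):]))

    states = []
    for d in dirs:
        states.append({
            "index": int(d.name[len("state"):]),
            "name": _read(d / "name"),           # driver-defined, e.g. "C6"
            "desc": _read(d / "desc"),
            "latency_us": _read(d / "latency"),  # exit latency in microseconds
            "usage": _read(d / "usage"),         # number of times entered
            "time_us": _read(d / "time"),        # total residency in microseconds
        })
    return states


if __name__ == "__main__":
    for s in read_idle_states(0):
        print(s["index"], s["name"], s["latency_us"], s["usage"], s["time_us"])
```

Comparing the `usage` counters before and after an idle period gives a rough idea of which states an isolated core actually entered.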

It can be useful in an idle state, for the purpose of saving power, to flush a cache but not power-gate it. The ACPI specification defines such a state, called C3, in Section 8.1.4 (of version 6.3):

While in the C3 state, the processor’s caches maintain state but the processor is not required to snoop bus master or multiprocessor CPU accesses to memory.

Later in the same section, it elaborates that C3 doesn't require preserving the state of the caches, but also doesn't require flushing them. Essentially, a core in ACPI C3 doesn't guarantee cache coherence. In an implementation of ACPI C3, either the system software is required to manually flush the cache before a core enters C3, or the hardware employs some mechanism to ensure coherence (flushing is not the only way). Because the core does not have to take part in cache coherence, this idle state can potentially save more power than shallower states.

To the best of my knowledge, the only processors that implement a non-power-gating version of ACPI C3 are those from Intel, starting with the Pentium II. All existing Intel x86 processors can be categorized according to how they implement ACPI C3:

  • Intel Core and later and Bonnell and later: The hardware state is called C3. The implementation uses multiple power-reduction mechanisms. The one relevant to the question flushes all the core caches (instruction, data, uop, paging unit), probably by executing a microcode routine on entry to the idle state. That is, all dirty lines are written back to the closest shared level of the memory hierarchy (L2 or L3) and all valid clean lines are invalidated. This is how cache coherency is maintained. The rest of the core state is retained.
  • Pentium II, Pentium III, Pentium 4, and Pentium M: The hardware state is called Sleep in these processors. In the Sleep state, the processor is fully clock-gated and doesn't respond to snoops (among other things). On-chip caches are not flushed and the hardware doesn't provide an alternative mechanism that protects the valid lines from becoming incoherent. Therefore, the system software is responsible for ensuring cache coherence. Otherwise, Intel specifies that if a snoop request occurs to a processor that is transitioning into or out of Sleep or already in Sleep, the resulting behavior is unpredictable.
  • All other processors don't support ACPI C3.
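
The flush described in the first bullet (write back dirty lines, then invalidate everything) can be sketched with a toy model. This is only an illustration of the concept, not how any real microcode works: the `PrivateCache` class, its unlimited capacity, and the use of a plain dict as the shared next level are all simplifying assumptions.

```python
from dataclasses import dataclass


@dataclass
class Line:
    data: int
    dirty: bool = False  # True if modified since it was filled


class PrivateCache:
    """Toy write-back cache; 'next_level' is a dict standing in for shared L2/L3."""

    def __init__(self, next_level):
        self.lines = {}            # address -> Line; only valid lines are stored
        self.next_level = next_level

    def read(self, addr):
        # On a miss, fill the line from the next level (clean copy).
        if addr not in self.lines:
            self.lines[addr] = Line(self.next_level.get(addr, 0))
        return self.lines[addr].data

    def write(self, addr, value):
        # Write-back policy: the next level is not updated until a flush.
        self.lines[addr] = Line(value, dirty=True)

    def flush(self):
        """Write back dirty lines, invalidate all lines, return the write-back count."""
        written = 0
        for addr, line in self.lines.items():
            if line.dirty:
                self.next_level[addr] = line.data
                written += 1
        self.lines.clear()         # clean lines are simply dropped
        return written


if __name__ == "__main__":
    shared = {0x40: 1}
    cache = PrivateCache(shared)
    cache.read(0x40)               # clean line
    cache.write(0x80, 7)           # dirty line, not yet visible in 'shared'
    print(cache.flush(), shared, cache.lines)
```

After `flush()`, the shared level holds the only copy of every line, so other cores can access memory without snooping the idle core. The cost is that the core starts with a cold cache when it wakes up.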

Note that clock-gating saves power by:

  • Turning off the clock generation logic, which itself consumes power.
  • Turning off any logic that does something on each clock cycle.

With clock-gating, dynamic power is reduced to essentially zero. But static power is still consumed to maintain state in the volatile memory structures.
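
The usual first-order CMOS power model makes this split explicit. It is a textbook approximation rather than something taken from the ACPI or Intel documents, and the symbols are the conventional ones: activity factor $\alpha$, switched capacitance $C$, supply voltage $V_{dd}$, clock frequency $f$, and leakage current $I_{\text{leak}}$.

```latex
P_{\text{total}}
  = \underbrace{\alpha \, C \, V_{dd}^{2} \, f}_{\text{dynamic}}
  + \underbrace{V_{dd} \, I_{\text{leak}}}_{\text{static}}
```

Clock-gating stops the gated logic from switching, so the dynamic term goes to roughly zero while $V_{dd}$ stays applied and the static term remains. Power-gating removes $V_{dd}$ from the gated domain, which eliminates both terms there but also loses whatever the SRAM was holding.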

Many processors include at least one level of on-chip cache that is shared between multiple cores. The processors branded Core Solo and Core Duo (whether based on the Enhanced Pentium M or Core microarchitecture) introduced an idle state that implements ACPI C3 at the package level, in which the shared cache may be gradually power-gated and restored (Intel's package-level states correspond to system-level states in the ACPI specification). This hardware state is called PC7, Enhanced Deeper Sleep State, Deep C4, or other names depending on the processor. The shared cache is much larger than the private caches, so it would take much more time to fully flush. This can reduce the effectiveness of PC7. Therefore, it's flushed gradually (the last core of the package that enters CC7 performs this operation). In addition, when the package exits PC7, the shared cache is enabled gradually as well, which may reduce the cost of entering PC7 next time. This is the basic idea, but the details depend on the processor. In PC7, significant portions of the package are power-gated.
