What is locality of reference?

Question

I am having trouble understanding locality of reference. Can anyone please help me understand what it means and what the following are:

  • Spatial locality of reference
  • Temporal locality of reference

Recommended Answer

This would not matter if your computer was filled with super-fast memory.

But unfortunately that's not the case, and computer memory looks something like this¹:

+----------+
| CPU      |  <<-- Our beloved CPU, superfast and always hungry for more data.
+----------+
|L1 - Cache|  <<-- works at 100% of CPU speed (fast)
+----------+
|L2 - Cache|  <<-- works at 25% of CPU speed (medium)
+----+-----+
     |
     |      <<-- This thin wire is the memory bus, it has limited bandwidth.
+----+-----+  <<-- works at 10% of CPU speed.
| main-mem |  <<-- The main memory is big but slow (because we are cheap-skates)
+----------+
     |
     |     <<-- Even slower wire to the harddisk
+----+-----+
| harddisk | <<-- Works at 0.001% of CPU speed
+----------+ 

Spatial Locality
In this diagram, the closer data is to the CPU, the faster the CPU can get at it.
This is related to Spatial Locality. Data has spatial locality if it is located close together in memory.
Because of the cheap-skates that we are, RAM is not really Random Access Memory; it is really "Slow If Random, Less Slow If Accessed Sequentially" Access Memory (SIRLSIAS-AM). DDR SDRAM transfers a whole burst of 32 or 64 bytes for one read or write command.
That is why it is smart to keep related data close together, so you can do a sequential read of a bunch of data and save time.
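
A minimal C sketch of that idea (the array name grid and the size N are made up for illustration): both functions below compute the same sum, but one walks memory sequentially and the other strides across it.

#include <stdio.h>

#define N 1024
static int grid[N][N];   /* row-major: grid[i][j] and grid[i][j+1] are adjacent */

long sum_rows(void)      /* good spatial locality: consecutive addresses */
{
    long sum = 0;
    for (int i = 0; i < N; i++)
        for (int j = 0; j < N; j++)
            sum += grid[i][j];
    return sum;
}

long sum_cols(void)      /* poor spatial locality: stride of N ints per access */
{
    long sum = 0;
    for (int j = 0; j < N; j++)
        for (int i = 0; i < N; i++)
            sum += grid[i][j];
    return sum;
}

int main(void)
{
    printf("%ld %ld\n", sum_rows(), sum_cols());
    return 0;
}

Once the array no longer fits in the cache, sum_rows() typically runs several times faster than sum_cols(), even though both do identical arithmetic: the sequential version uses every byte of each 32- or 64-byte burst, while the strided version uses only 4 bytes of each one.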

Temporal Locality
Data stays in main memory, but it cannot all stay in the cache, or the cache would stop being useful. Only the most recently used data can be found in the cache; old data gets pushed out.
This is related to temporal locality. Data has strong temporal locality if its accesses are close together in time.
This is important because if item A is in the cache (good), then item B (with strong temporal locality to A) is very likely to also be in the cache.
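
A hedged sketch of the same point (LEN and the function names are invented for illustration): making two separate passes over a large array gives each element weak temporal locality, because by the time the second pass returns to a[i] it has long been evicted; fusing the work into one pass touches each element again while it is still in the cache.

#include <stdio.h>
#include <stdlib.h>

#define LEN (1u << 22)   /* ~4M floats, chosen to be much larger than L2 */

void two_passes(float *a)   /* weak temporal locality */
{
    for (size_t i = 0; i < LEN; i++) a[i] *= 2.0f;   /* pass 1 */
    for (size_t i = 0; i < LEN; i++) a[i] += 1.0f;   /* pass 2: a[i] was evicted long ago */
}

void one_pass(float *a)     /* strong temporal locality */
{
    for (size_t i = 0; i < LEN; i++)
        a[i] = a[i] * 2.0f + 1.0f;   /* second use while the element is still hot */
}

int main(void)
{
    float *a = calloc(LEN, sizeof *a);
    if (!a) return 1;
    two_passes(a);
    one_pass(a);
    printf("%f\n", a[0]);
    free(a);
    return 0;
}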

Footnote 1:

This is a simplification, with percentages made up for example purposes, but it gives you the right order-of-magnitude idea for typical CPUs.

In reality, latency and bandwidth are separate factors, and latency is harder to improve for memory farther from the CPU. But hardware prefetching and/or out-of-order execution can hide latency in some cases, like looping over an array. With unpredictable access patterns, effective memory throughput can be much lower than 10% of L1d cache bandwidth.
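
As an illustration of why, here is a pointer-chasing sketch (the node count, the xorshift PRNG, and the Sattolo shuffle are all choices made up for this example): every load's address depends on the result of the previous load, so neither the prefetcher nor out-of-order execution can overlap the misses, and each step pays a full round trip to memory.

#include <stdio.h>
#include <stdlib.h>
#include <stdint.h>

#define NODES (1u << 22)   /* big enough that the chain cannot fit in cache */

static size_t next_idx[NODES];

/* Tiny PRNG so the shuffle is not limited by RAND_MAX. */
static uint64_t rng_state = 0x9E3779B97F4A7C15ull;
static uint64_t xorshift64(void)
{
    rng_state ^= rng_state << 13;
    rng_state ^= rng_state >> 7;
    rng_state ^= rng_state << 17;
    return rng_state;
}

static size_t chase(size_t start, size_t steps)
{
    size_t i = start;
    while (steps--)
        i = next_idx[i];   /* address of the next load depends on this one */
    return i;
}

int main(void)
{
    /* next_idx[i] = i + 1 would give a prefetch-friendly sequential chain;
       instead, Sattolo's algorithm (swap with j < i) turns the identity
       permutation into one random cycle, so the walk is unpredictable. */
    for (size_t i = 0; i < NODES; i++)
        next_idx[i] = i;
    for (size_t i = NODES - 1; i > 0; i--) {
        size_t j = (size_t)(xorshift64() % i);
        size_t tmp = next_idx[i];
        next_idx[i] = next_idx[j];
        next_idx[j] = tmp;
    }
    printf("%zu\n", chase(0, NODES));   /* one full lap around the cycle */
    return 0;
}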

This simplified version also leaves out TLB effects (page-granularity locality) and DRAM-page locality (not the same thing as virtual memory pages). For a much deeper dive into memory hardware and tuning software for it, see What Every Programmer Should Know About Memory?

Related: Why is the size of L1 cache smaller than that of the L2 cache in most of the processors? explains why a multi-level cache hierarchy is necessary to get the combination of latency/bandwidth and capacity (and hit-rate) we want.

One huge, fast L1 data cache would be prohibitively power-expensive, and it still could not achieve latency as low as the small, fast L1d caches in modern high-performance CPUs.

In multi-core CPUs, the L1i/L1d and L2 caches are typically per-core private caches, with a shared L3 cache. Different cores have to compete with each other for L3 and memory bandwidth, but each core has its own L1 and L2 bandwidth. See How can cache be that fast? for a benchmark result from a dual-core 3GHz IvyBridge CPU: aggregate L1d cache read bandwidth on both cores of 186 GB/s vs. 9.6 GB/s DRAM read bandwidth with both cores active. (So memory = 10% of single-core L1d bandwidth is a good estimate for desktop CPUs of that generation, which had only 128-bit SIMD load/store data paths.) And L1d latency is 1.4 ns vs. DRAM latency of 72 ns.
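
To get a rough feel for such numbers on your own machine, a very simplified single-threaded sketch like the following can estimate read bandwidth (the buffer size, the one-byte-per-line stride, and CLOCK_MONOTONIC are choices made up here; the benchmarks linked above control for far more).

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

#define BYTES (1u << 28)   /* 256 MiB, far larger than any cache level */

int main(void)
{
    unsigned char *buf = malloc(BYTES);
    if (!buf) return 1;
    for (size_t i = 0; i < BYTES; i++) buf[i] = (unsigned char)i;

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    unsigned long s = 0;
    for (size_t i = 0; i < BYTES; i += 64)   /* touch one byte per 64-byte line */
        s += buf[i];
    clock_gettime(CLOCK_MONOTONIC, &t1);

    double secs = (t1.tv_sec - t0.tv_sec) + (t1.tv_nsec - t0.tv_nsec) / 1e9;
    printf("~%.1f GB/s (checksum %lu)\n", BYTES / secs / 1e9, s);
    free(buf);
    return 0;
}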
