What is meant by data cache and instruction cache?


Problem description


From here:

Instructions and data have different access patterns, and access different regions of memory. Thus, having the same cache for both instructions and data may not always work out.

Thus, it's rather common to have two caches: an instruction cache that only stores instructions, and a data cache that only stores data.

It's intuitive to know the distinction between instructions and data, but now I'm not so sure of the difference in this context. What counts as data and gets put into the data cache, and what counts as instructions and gets put into the instruction cache?

I know ARM assembly. Would anything requiring STR, LDR, LDMF or STMFD use the data cache? But technically speaking STR, LDR, LDMF and STMFD are all instructions, so this is why I'm confused. Must "data" always exist alongside an "instruction"? Is data considered to be anything in the .data section?

For example, with LDR R1, =myVar, would LDR go into the instruction cache and the contents of myVar go into the data cache? Or does it not work like that?

"Instructions and data have different access patterns": could someone please elaborate?

This comment I made on a helpful post highlights my difficulty understanding:

"The idea is that if an instruction has been loaded from memory, it's likely to be used again soon" but the only way to know the next instruction is to read it. That means a memory read (you can't say it's already in cache because a new instruction is being red). So I still don't see the point? Say a LDR instruction just happened, so now LDR is in the data cache. Maybe another LDR instruction will happen, maybe it won't, we can't be sure so we have to actually read the next instruction - thus defeating the purpose of cache.

Solution

Instruction fetches can be done in chunks, on the assumption that much of the time you are going to run through many instructions in a row. So instruction fetches can be more efficient: there is likely a handful or more clocks of overhead per transaction, then the delay for the memory to have the data ready, then one clock per bus width for the size of the transaction. Fetching 8 words or instructions might be, say, 5+N+8 clocks, which is more efficient than fetching one instruction at a time at (5+1+1)*8.
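As a rough illustration of that arithmetic, here is a sketch in C; the 5-clock overhead and the two latency values are made-up numbers for the example, not figures for any particular bus or memory:

#include <stdio.h>

/* Illustrative cost model for one bus transaction:
 * overhead clocks + memory latency + one clock per bus-width beat.
 * All the numbers are assumptions for the sake of the example. */
static unsigned burst_cost(unsigned overhead, unsigned latency, unsigned beats)
{
    return overhead + latency + beats;
}

int main(void)
{
    unsigned overhead = 5;            /* assumed per-transaction overhead */
    unsigned beats    = 8;            /* 8 words fetched in one burst */
    unsigned latencies[] = { 1, 30 }; /* assumed: ~1 for cache, ~30 for DRAM */

    for (unsigned i = 0; i < 2; i++) {
        unsigned n = latencies[i];
        unsigned burst  = burst_cost(overhead, n, beats);     /* 5+N+8     */
        unsigned single = burst_cost(overhead, n, 1) * beats; /* (5+N+1)*8 */
        printf("N=%2u: burst of 8 = %3u clocks, one at a time = %3u clocks\n",
               n, burst, single);
    }
    return 0;
}

The bigger N gets, the more the burst wins, since the per-transaction overhead and latency are paid once instead of eight times.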

For data, on the other hand, it is not that good an assumption that the data will be read sequentially much of the time, so the additional cycles can hurt; only fetch the data asked for (up to the width of the memory or bus, since that much is a freebie).

On the ARMs I know about, the L1 cache I and D are separate, and at L2 they are combined. L1 is not on the AXI/AMBA bus and is likely a more efficient access than L2 and beyond, which are AMBA/AXI (a few cycles of overhead, plus time, plus one clock per bus width of data for every transaction).
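One way to see the L1/L2/DRAM latency difference from software is a classic pointer-chasing sketch: dependent loads serialise, so the time per hop tracks the load latency of whichever level the working set fits in. The sizes, the 17-word stride, and the use of clock() here are all arbitrary assumptions; treat the output as a trend, not a measurement.

#include <stdio.h>
#include <stdlib.h>
#include <time.h>

static void chase(size_t words)
{
    size_t *ring = malloc(words * sizeof *ring);
    if (!ring) return;

    /* Build a single-cycle ring (gcd(17, words) == 1 for these
     * power-of-two sizes) so every element gets visited. */
    for (size_t i = 0; i < words; i++)
        ring[i] = (i + 17) % words;

    size_t idx = 0;
    const long hops = 10000000L;

    clock_t t = clock();
    for (long i = 0; i < hops; i++)
        idx = ring[idx];               /* each load depends on the previous */
    long ticks = (long)(clock() - t);

    /* idx is printed as a sink so the compiler cannot delete the loop. */
    printf("%6zu KB working set: %ld ticks (idx=%zu)\n",
           words * sizeof(size_t) / 1024, ticks, idx);
    free(ring);
}

int main(void)
{
    chase(2 * 1024);         /*  16 KB: should sit in L1 */
    chase(64 * 1024);        /* 512 KB: likely L2 */
    chase(8 * 1024 * 1024);  /*  64 MB: mostly DRAM */
    return 0;
}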

For address spaces that are marked as cacheable (if the MMU is on), the L1, and as a result the L2, will fetch a cache line instead of the individual item for data, and perhaps more than one fetch's worth of I data for an instruction fetch.
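Line-granularity fills are easy to observe from C. A minimal sketch, assuming a 64-byte line (common today; older ARM cores often use 32 bytes): a stride-1 walk pays one miss per line, while a stride-of-a-line walk pays one miss per access.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

#define LINE 64                        /* assumed cache line size in bytes */
#define SIZE (64 * 1024 * 1024)        /* 64 MB: far bigger than any cache */

static unsigned walk(volatile unsigned char *buf, size_t size, size_t stride)
{
    unsigned sum = 0;
    for (size_t i = 0; i < size; i += stride)
        sum += buf[i];                 /* volatile: the loads stay in the code */
    return sum;
}

int main(void)
{
    unsigned char *buf = malloc(SIZE);
    if (!buf) return 1;
    memset(buf, 1, SIZE);              /* fault the pages in before timing */

    clock_t t0 = clock();
    walk(buf, SIZE, 1);                /* one miss per LINE bytes */
    clock_t t1 = clock();
    walk(buf, SIZE, LINE);             /* one miss per access */
    clock_t t2 = clock();

    printf("stride 1  : %ld ticks for %d loads\n", (long)(t1 - t0), SIZE);
    printf("stride %-3d: %ld ticks for %d loads\n", LINE, (long)(t2 - t1), SIZE / LINE);

    free(buf);
    return 0;
}

The interesting number is time per load: the stride-1 walk does 64 times more loads but normally takes nowhere near 64 times as long, because 63 of every 64 loads hit the line that the miss already pulled in.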

Each of your LDR and LDM instructions is going to result in data cycles that can, if the address is cacheable, go into the L2 and L1 caches if not already there. The instruction itself, likewise, if at a cacheable address, will go into the L2 and L1 caches if not already there. (Yes, there are lots of knobs to control what is cacheable and what is not; I don't want to get into those nuances, so just assume for the sake of the discussion that all of these instruction fetches and data accesses are cacheable.)

You would want to save instructions just executed in the cache in case you have a loop or run that code again. Also, the instructions that follow in the cache line will benefit from the saved overhead of the more efficient access. But if you only execute through a small percentage of the cache line, then overall those cycles are a waste, and if that happens too much then the cache has made things slower.

Once something is in a cache, then the next time it is read (or written, depending on the settings), the cache copy is the one that is used, not the copy in slow memory. Eventually (depending on settings), if the cache copy of some item has been modified due to a write (STR, STM) and some new access needs to be saved in the cache, then an old one is evicted back to slow memory and a write from the cache to slow memory happens. You don't have this problem with instructions: instructions are basically read-only, so you don't have to write them back to slow memory; in theory the cache copy and the slow memory copy are the same.
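To make the eviction and write-back step concrete, here is a toy model of a single direct-mapped, write-back cache line in C (the line size, the slow_mem array, and the helper names are all invented for illustration, not any real cache's behaviour):

#include <stdio.h>
#include <string.h>

#define LINE_WORDS 8
#define MEM_WORDS  64

static unsigned slow_mem[MEM_WORDS];     /* stands in for DRAM */

/* One direct-mapped, write-back cache line. */
struct line {
    int      valid;
    int      dirty;                      /* modified by a write (STR/STM)? */
    unsigned tag;                        /* which line of slow_mem we hold */
    unsigned data[LINE_WORDS];
};

static struct line L1;                   /* a one-line "cache" */

/* Fill the line from slow memory, writing the old contents back first
 * if they were modified: this is the eviction described above. */
static void fill(unsigned tag)
{
    if (L1.valid && L1.dirty) {
        memcpy(&slow_mem[L1.tag * LINE_WORDS], L1.data, sizeof L1.data);
        printf("  evict: write dirty line %u back to slow memory\n", L1.tag);
    }
    memcpy(L1.data, &slow_mem[tag * LINE_WORDS], sizeof L1.data);
    L1.valid = 1;
    L1.dirty = 0;
    L1.tag   = tag;
    printf("  fill:  read line %u from slow memory\n", tag);
}

static unsigned load(unsigned addr)          /* models LDR */
{
    unsigned tag = addr / LINE_WORDS;
    if (!L1.valid || L1.tag != tag)
        fill(tag);                           /* miss */
    return L1.data[addr % LINE_WORDS];       /* hit: no slow-memory access */
}

static void store(unsigned addr, unsigned v) /* models STR */
{
    unsigned tag = addr / LINE_WORDS;
    if (!L1.valid || L1.tag != tag)
        fill(tag);
    L1.data[addr % LINE_WORDS] = v;
    L1.dirty = 1;                            /* write-back: defer the DRAM write */
}

int main(void)
{
    printf("load 0:\n");  load(0);           /* miss, fill line 0 */
    printf("store 1:\n"); store(1, 42);      /* hit, line 0 now dirty */
    printf("load 9:\n");  load(9);           /* miss on line 1: dirty line 0 written back */
    printf("slow_mem[1] = %u\n", slow_mem[1]);
    return 0;
}

The store to address 1 only touches the cache copy; slow_mem[1] does not become 42 until the miss on address 9 forces the dirty line out.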

ldr r1,=myvar

will result in a PC-relative load:

ldr r1,something        @ pc-relative load from a nearby literal pool
...
something: .word myvar  @ assembler-generated word holding the address of myvar

The LDR instruction will be part of a cache line fetch, an instruction fetch (along with a bunch more instructions). These will be saved in the I part of the L1 cache on an ARM, and in the shared part of L2 (if enabled, etc.). When that instruction is finally executed, the address of something will experience a data read, and if caching is enabled in that area for that read, it will also go into the L2 and L1 caches (D part) if not already there. If you loop around and run that instruction again right away, then ideally the instruction will be in the L1 cache and the access time to fetch it is very fast, a handful of clocks total. The data will also be in the L1 cache and will also be a handful of clocks to read.
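A small experiment in that spirit (a sketch: the 16 KB buffer is assumed to fit in a typical 32 KB L1 D-cache, and clock() is coarse, so the numbers are only indicative): the first pass pulls the lines in, and the following passes should run out of L1.

#include <stdio.h>
#include <time.h>

#define WORDS 4096                     /* 16 KB: assumed to fit in L1 D */

static volatile unsigned buf[WORDS];

static unsigned pass(void)
{
    unsigned sum = 0;
    for (int i = 0; i < WORDS; i++)
        sum += buf[i];                 /* volatile keeps the loads */
    return sum;
}

int main(void)
{
    clock_t t = clock();
    pass();                            /* cold pass: lines come from L2/DRAM */
    printf("first pass : %ld ticks\n", (long)(clock() - t));

    t = clock();
    for (int i = 0; i < 1000; i++)     /* hot passes: loop code and data in L1 */
        pass();
    printf("1000 passes: %ld ticks\n", (long)(clock() - t));
    return 0;
}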

The 5+N+8 I mentioned above: some number of clocks of overhead (5 is just a possibility; it can vary both by the design and by what else is going on in parallel). The N depends on the slower memory's speed. That N is quite large for DRAM, so the caches, L2 and L1, are much, much faster, and that is why the cache is there at all: to reduce the large number of clock cycles for every DRAM access, efficient or not.
