Where is the L1 memory cache of Intel x86 processors documented?


Question

I am trying to profile and optimize algorithms and I would like to understand the specific impact of the caches on various processors. For recent Intel x86 processors (e.g. Q9300), it is very hard to find detailed information about cache structure. In particular, most web sites (including Intel.com) that post processor specs do not include any reference to L1 cache. Is this because the L1 cache does not exist or is this information for some reason considered unimportant? Are there any articles or discussions about the elimination of the L1 cache?

[edit] After running various tests and diagnostic programs (mostly those discussed in the answers below), I have concluded that my Q9300 seems to have a 32K L1 data cache. I still haven't found a clear explanation as to why this information is so difficult to come by. My current working theory is that the details of L1 caching are now being treated as trade secrets by Intel.

Recommended answer

It is nearly impossible to find specs on Intel caches. When I was teaching a class on caches last year, I asked friends inside Intel (in the compiler group) and they couldn't find specs.

But wait!!! Jed, bless his soul, tells us that on Linux systems, you can squeeze lots of information out of the kernel:

grep . /sys/devices/system/cpu/cpu0/cache/index*/*

This will give you the associativity, number of sets, line size, total size, and a bunch of other information (but not latency). For example, I learned that although AMD advertises their 128K L1 cache, my AMD machine has split I and D caches of 64K each.
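The same sysfs files can also be read from a script. The following is a minimal sketch, assuming a Linux system that exposes the directory used in the grep command above; the function name `read_cache_info` is made up for illustration, and which attribute files exist can vary by kernel and CPU.

```python
import os

def read_cache_info(cache_dir="/sys/devices/system/cpu/cpu0/cache"):
    """Return one dict per index*/ directory, keyed by attribute file name."""
    caches = []
    if not os.path.isdir(cache_dir):
        return caches  # not Linux, or sysfs is not exposed here
    for entry in sorted(os.listdir(cache_dir)):
        index_path = os.path.join(cache_dir, entry)
        if not (entry.startswith("index") and os.path.isdir(index_path)):
            continue
        info = {"name": entry}
        for attr in sorted(os.listdir(index_path)):
            attr_path = os.path.join(index_path, attr)
            if not os.path.isfile(attr_path):
                continue  # skip subdirectories such as power/
            try:
                with open(attr_path) as f:
                    info[attr] = f.read().strip()
            except OSError:
                pass  # some attributes may not be readable
        caches.append(info)
    return caches

if __name__ == "__main__":
    for c in read_cache_info():
        # .get() is used because not every kernel exposes every attribute
        print(c.get("level"), c.get("type"), c.get("size"),
              c.get("ways_of_associativity"), c.get("number_of_sets"),
              c.get("coherency_line_size"))
```

On a machine with a split L1, this should list separate entries whose `type` is `Data` and `Instruction` at level 1.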

Two suggestions which are now mostly obsolete thanks to Jed:


  • AMD publishes a lot more information about its caches, so you can at least get some information about a modern cache. For example, last year's AMD L1 caches delivered two words per cycle (peak).

  • The open-source tool valgrind has all sorts of cache models inside it, and it is invaluable for profiling and understanding cache behavior. It comes with a very nice visualization tool, kcachegrind, which is part of the KDE SDK.
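To give a feel for what a cache model does, here is a toy set-associative simulator. It is only a sketch and not valgrind's implementation: the class name `ToyCache`, the LRU replacement policy, and the default geometry (32kB, 8-way, 64-byte lines) are all assumptions chosen for illustration.

```python
from collections import OrderedDict

class ToyCache:
    """Toy set-associative cache with LRU replacement (an assumed policy)."""

    def __init__(self, size_bytes=32 * 1024, ways=8, line_bytes=64):
        self.ways = ways
        self.line_bytes = line_bytes
        self.num_sets = size_bytes // (ways * line_bytes)
        # One OrderedDict per set; insertion order tracks recency of use.
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address):
        """Simulate one access; return True on a hit, False on a miss."""
        line = address // self.line_bytes
        lines = self.sets[line % self.num_sets]
        if line in lines:
            lines.move_to_end(line)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(lines) >= self.ways:
            lines.popitem(last=False)  # evict the least recently used line
        lines[line] = True
        return False

def miss_rate(addresses, **cache_params):
    """Fraction of accesses that miss in a fresh ToyCache."""
    cache = ToyCache(**cache_params)
    for a in addresses:
        cache.access(a)
    total = cache.hits + cache.misses
    return cache.misses / total if total else 0.0

# Eight addresses 4096 bytes apart fit in one 8-way set: only cold misses.
print(miss_rate([k * 4096 for k in range(8)] * 4))  # 0.25
# Nine such addresses thrash the same set under LRU: every access misses.
print(miss_rate([k * 4096 for k in range(9)] * 4))  # 1.0
```

The two calls at the end show why access stride matters: with this geometry, addresses 4096 bytes apart all compete for the same set.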

For example: as of Q3 2008, AMD K8/K10 CPUs use 64-byte cache lines, with split L1I/L1D caches of 64kB each. L1D is 2-way associative and exclusive with L2, with a latency of 3 cycles. The L2 cache is 16-way associative, with a latency of about 12 cycles.

AMD Bulldozer-family CPUs use a split L1 with a 16kiB 4-way associative L1D per cluster (2 per core).

Intel CPUs have kept L1 the same for a long time (from Pentium M to Haswell to Skylake, and presumably many generations after that): split I and D caches of 32kB each, with L1D being 8-way associative. Cache lines are 64 bytes, matching the burst-transfer size of DDR DRAM. Load-use latency is ~4 cycles.
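The figures above fix the cache geometry: the number of sets is the cache size divided by (associativity × line size). A small sketch of that arithmetic follows; the helper names are hypothetical, and `set_index` assumes plain modulo indexing on the address bits just above the line offset.

```python
def num_sets(size_bytes, ways, line_bytes):
    """Number of sets in a set-associative cache."""
    sets, remainder = divmod(size_bytes, ways * line_bytes)
    if remainder:
        raise ValueError("size must be a multiple of ways * line size")
    return sets

def set_index(address, line_bytes, sets):
    """Set selected by an address, assuming simple modulo indexing."""
    return (address // line_bytes) % sets

# Intel L1D as described above: 32kB, 8-way, 64-byte lines
intel_l1d_sets = num_sets(32 * 1024, ways=8, line_bytes=64)   # 64 sets
# AMD K8/K10 L1D as described above: 64kB, 2-way, 64-byte lines
amd_k8_l1d_sets = num_sets(64 * 1024, ways=2, line_bytes=64)  # 512 sets

print(intel_l1d_sets, amd_k8_l1d_sets)
# With 64 sets of 64-byte lines, addresses 4096 bytes apart share a set.
print(set_index(0, 64, intel_l1d_sets), set_index(4096, 64, intel_l1d_sets))
```

With 64 sets and 8 ways, at most eight lines whose addresses differ by a multiple of 4096 bytes can sit in this L1D at once, which is worth keeping in mind when choosing array strides.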

Also see the x86 tag wiki for links to more performance and microarchitectural data.
