Linux slab allocator and cache performance


Question

From the book Understanding the Linux Kernel, 3rd edition, chapter 8.2.10, Slab Coloring:

We know from Chapter 2 that the same hardware cache line maps many different blocks of RAM. In this chapter, we have also seen that objects of the same size end up being stored at the same offset within a cache. Objects that have the same offset within different slabs will, with a relatively high probability, end up mapped in the same cache line. The cache hardware might therefore waste memory cycles transferring two objects from the same cache line back and forth to different RAM locations, while other cache lines go underutilized. The slab allocator tries to reduce this unpleasant cache behavior by a policy called slab coloring: different arbitrary values called colors are assigned to the slabs.

(1) I am unable to understand the issue that slab coloring tries to solve. When a normal process accesses data that is not in the cache, it hits a cache miss, and the data is fetched into the cache along with the data at the surrounding addresses to boost performance. How can a situation occur in which the same specific cache lines keep getting swapped? The probability that a process keeps accessing two different data addresses at the same offset inside two different memory areas is very low. And even if it does happen, cache policies usually choose the lines to be swapped according to some scheme such as LRU, random, etc. No policy exists that chooses which lines to evict according to a match in the least significant bits of the addresses being accessed.

(2) I am unable to understand how slab coloring, which moves free bytes from the end of the slab to its beginning so that different slabs have different offsets for their first objects, solves the cache-swapping issue.

[SOLVED] After a small investigation I believe I found an answer to my question. The answer has been posted.

Recommended answer

I think I got it: the answer is related to associativity.

A cache can be divided into sets, and each memory block can be cached in only one particular set, determined by its address. For example, in a cache with 8 sets, set 0 holds memory blocks whose block address is a multiple of 8, set 1 holds blocks whose block address is one more than a multiple of 8, and so on. The reason for this is to boost cache performance: it avoids searching the whole cache for every address, since only one set needs to be searched. A replacement policy such as LRU only chooses among the entries of that one set.

Now, from the link Understanding CPU Caching and performance:

From page 377 of Hennessy and Patterson, the cache placement formula is as follows: (Block address) MOD (Number of sets in cache)
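As a minimal sketch of the placement formula, the function below first turns a byte address into a block address and then applies the modulo. The function name `cache_set_index` and the default values (64-byte lines, 64 sets) are assumptions chosen for illustration, not the parameters of any particular CPU.

```python
def cache_set_index(byte_address, line_size=64, num_sets=64):
    """Return the set a byte address maps to.

    line_size and num_sets are hypothetical values for illustration.
    """
    # The block address is the byte address with the in-line offset dropped.
    block_address = byte_address // line_size
    # (Block address) MOD (Number of sets in cache)
    return block_address % num_sets


# Two addresses 4 KiB apart land in the same set with these parameters,
# because 4096 bytes is exactly line_size * num_sets.
print(cache_set_index(0x1000), cache_set_index(0x2000), cache_set_index(0x1040))
```

With these assumed parameters, any two addresses that differ by a multiple of 4096 bytes share a set, which is why page-aligned slabs are prone to colliding.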

Let's take memory block address 0x10000008 (from slabX with color C) and memory block address 0x20000009 (from slabY with color Z). For typical values of N (the number of sets in the cache, normally a power of two), the calculation <address> MOD <N> yields a different value for each, so the two blocks are cached in different sets. If the addresses had the same least significant bits (for example 0x10000008 and 0x20000008), then for those same values of N the calculation would yield the same value, so the blocks would collide in the same cache set.
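The claim can be checked directly with the addresses from the example, treated as block addresses. This is only a sketch: the helper `same_set` and the range of set counts (powers of two from 2 to 1024) are assumptions made for the check.

```python
def same_set(block_a, block_b, num_sets):
    """True if both block addresses map to the same cache set."""
    return block_a % num_sets == block_b % num_sets


# Hypothetical set counts: powers of two from 2 to 1024.
SET_COUNTS = [2 ** k for k in range(1, 11)]

# Different low bits (…08 vs …09): set counts for which the blocks collide.
collisions_different_low_bits = [
    n for n in SET_COUNTS if same_set(0x10000008, 0x20000009, n)
]

# Same low bits (…08 vs …08): set counts for which the blocks collide.
collisions_same_low_bits = [
    n for n in SET_COUNTS if same_set(0x10000008, 0x20000008, n)
]

print(collisions_different_low_bits)  # empty: never the same set
print(collisions_same_low_bits)       # every set count in SET_COUNTS
```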

So, by keeping a different offset (color) for the objects in different slabs, the slabs' objects will potentially land in different cache sets instead of colliding in the same set, and overall cache performance improves.
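The effect of coloring can be illustrated with a small model. Everything here is an assumption for the sake of the example: slabs are page-aligned 4 KiB regions, the cache has 64 sets of 64-byte lines, and a slab's color shifts its first object by a whole number of cache lines. It is not the kernel's actual code.

```python
LINE_SIZE = 64    # assumed cache line size in bytes
NUM_SETS = 64     # assumed number of cache sets
PAGE_SIZE = 4096  # assumed slab size and alignment


def set_index(address):
    """Cache set for a byte address under the assumed geometry."""
    return (address // LINE_SIZE) % NUM_SETS


def first_object_sets(num_slabs, num_colors):
    """Cache set of the first object in each slab.

    Slab i gets color i % num_colors, and the color offsets the first
    object by color * LINE_SIZE bytes. num_colors=1 means no coloring.
    """
    sets = []
    for slab in range(num_slabs):
        base = 0x100000 + slab * PAGE_SIZE  # hypothetical page-aligned base
        color = slab % num_colors
        sets.append(set_index(base + color * LINE_SIZE))
    return sets


uncolored = first_object_sets(4, 1)
colored = first_object_sets(4, 4)
print(uncolored)  # [0, 0, 0, 0]: every first object competes for set 0
print(colored)    # [0, 1, 2, 3]: first objects spread over four sets
```

The model also shows where the color offset comes from: it consumes bytes that would otherwise be left unused at the end of the slab, so the number of available colors is bounded by that free space.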

Furthermore, if the cache is direct-mapped, then according to the Wikipedia article on CPU cache, no cache replacement policy exists, and the modulo calculation yields the cache block in which the memory block will be stored:

Direct-mapped cache: In this cache organization, each location in main memory can go in only one entry in the cache. Therefore, a direct-mapped cache can also be called a "one-way set associative" cache. It does not have a replacement policy as such, since there is no choice of which cache entry's contents to evict. This means that if two locations map to the same entry, they may continually knock each other out. Although simpler, a direct-mapped cache needs to be much larger than an associative one to give comparable performance, and it is more unpredictable. Let x be block number in cache, y be block number of memory, and n be number of blocks in cache, then mapping is done with the help of the equation x = y mod n.
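The "continually knock each other out" behavior is easy to reproduce in a toy direct-mapped cache that uses x = y mod n. The function `count_misses` and the access patterns below are hypothetical, chosen only to make the conflict visible.

```python
def count_misses(block_numbers, num_entries):
    """Count misses in a toy direct-mapped cache with num_entries entries."""
    entries = [None] * num_entries  # entries[x] holds the cached block number
    misses = 0
    for y in block_numbers:
        x = y % num_entries         # x = y mod n
        if entries[x] != y:
            misses += 1
            entries[x] = y          # evict whatever was there; no policy involved
    return misses


# Blocks 0 and 8 both map to entry 0 when n = 8, so alternating between
# them misses on every access.
conflicting_misses = count_misses([0, 8] * 5, 8)

# Blocks 0 and 9 map to entries 0 and 1, so only the first access to
# each block misses.
spread_misses = count_misses([0, 9] * 5, 8)

print(conflicting_misses, spread_misses)  # 10 2
```

This is the access pattern the question considered unlikely: it does not depend on the replacement policy at all, only on two hot blocks sharing an entry.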
