CPU cache behaviour/policy for file-backed memory mappings?

Question

Does anyone know which type of CPU cache behaviour or policy (e.g. uncacheable write-combining) is assigned to memory mapped file-backed regions on modern x86 systems?

Is there any way to detect which is the case, and possibly override the default behaviour?

Windows and Linux are the main operating systems of interest.

(Editor's note: the question was previously phrased as memory mapped I/O, but that phrase has a different specific technical meaning, especially when talking about CPU caches: it refers to actual I/O devices, like NICs or video cards, that you talk to with loads and stores.

This question is actually about what kind of memory you get from mmap(some_fd, ...), when you don't use MAP_ANONYMOUS and it's backed by a regular file on disk.)

Recommended answer

TL;DR: Memory-mapped files use the normal write-back (WB) policy for the pagecache pages that they map into the address space of your process. You have to do something special and OS-specific if you ever want pages that aren't WB.

The caching policy applied to a region of the address space is generally independent of the operating system and depends only on the type of device behind the pages. In fact, the operating system is free to apply any caching policy to any memory region, but an incorrectly assigned policy can reduce system performance or break system logic entirely.

There are at least four caching policies:

  1. Full caching (write-back, aka WB). Applied to physical address space that maps to main memory (RAM). Used to increase the performance of the memory subsystem. The main property of such a device is that its state can be changed only by software and can affect only software.

Memory-mapped file implementations use full caching because they are implemented entirely in software (the operating system), which reads a chunk of the file from disk, places it in memory, and later writes the (possibly modified) chunk back to disk. Hardware updates a "dirty" bit in the page tables to let the OS figure out what needs to be synced to disk.
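As a rough illustration of the paragraph above, here is a minimal POSIX sketch that modifies a file through a shared file-backed mapping. The function name `patch_file_via_mmap` is hypothetical, and the code assumes a Linux-like system and an existing, non-empty regular file. The stores are ordinary cached stores; nothing in the code selects a cache policy, because the default for such a mapping is already write-back.

```c
#include <fcntl.h>
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: overwrite the first bytes of an existing, non-empty
 * regular file through a shared file-backed mapping.
 * Returns 0 on success, -1 on error. */
int patch_file_via_mmap(const char *path, const char *text)
{
    int fd = open(path, O_RDWR);
    if (fd < 0)
        return -1;

    struct stat st;
    if (fstat(fd, &st) < 0 || st.st_size == 0) {
        close(fd);
        return -1;
    }
    size_t len = (size_t)st.st_size;

    /* No MAP_ANONYMOUS: the mapped pages are pagecache pages in ordinary RAM. */
    char *p = mmap(NULL, len, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    close(fd); /* the mapping remains valid after the descriptor is closed */
    if (p == MAP_FAILED)
        return -1;

    size_t n = strlen(text);
    if (n > len)
        n = len; /* never write past the end of the mapping */

    /* Plain stores into cached memory; the hardware marks the page dirty. */
    memcpy(p, text, n);

    /* Ask the OS to write the dirty pages back to the file now. */
    int rc = msync(p, len, MS_SYNC);
    munmap(p, len);
    return rc;
}
```

Calling `patch_file_via_mmap("data.bin", "HELLO")` on a file that contains `hello world` should leave `HELLO world` on disk. The `msync` call only forces the write-back to happen early; without it the OS would still flush the dirty page eventually.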

  2. Write-through caching (WT). The main property of such a device is that its state can be changed only by software, but a change must take effect on the device immediately. Under this policy, data written to a memory-mapped IO device register is placed in two places at once: in the cache and in the device. A later read of that data is then served from the cache, without an expensive access to the device.

This cache policy could be useful for an MMIO device that doesn't write its memory and only reads what the CPU wrote. In practice it's rarely used for anything. GPUs aren't like that, and do write video memory, so it's not used for video RAM. (There's no mechanism for the GPU to invalidate CPU caches of the region, because the GPU isn't part of the CPU's cache-coherency domain.)

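If accesses to memory-mapped IO device registers were cached, two problems would follow:

  1. Writes to a memory-mapped IO device register would be delayed until the cache controller decides to flush the cache line holding the written data. As a result, the driver cannot know when a command written to the device takes effect.
  2. Data read from a memory-mapped IO device register could be cached. Subsequent reads of the same register could return stale data from the cache instead of the device's actual data. As a result, the driver would have a hard time capturing the real state of the device.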

Because the way software specifies the caching policy depends only on the processor, the same algorithm can be applied in any operating system. The simplest way is to capture the contents of the CR3 register, use it to locate the page table entry for the address whose caching policy you want to know, and check the PCD and PWT flags. But this approach is incomplete, because a few other features also affect caching (for example, caching can be disabled completely through CR0; see also MTRR and PAT).
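The flag check described above can be sketched as a small decoder. This is only an illustration under stated assumptions: the function names are hypothetical, the raw page table entry must be obtained elsewhere (reading CR3 and walking the page tables needs kernel privileges), and the PAT bit position used here (bit 7) applies to 4 KiB page table entries only. The constant is the power-on default of the IA32_PAT MSR; an operating system may reprogram that MSR, so the real value should be read on the target system. MTRRs and CR0.CD are not taken into account.

```c
#include <stdint.h>

/* x86 memory type encodings as stored in IA32_PAT entries. */
enum { MT_UC = 0, MT_WC = 1, MT_WT = 4, MT_WP = 5, MT_WB = 6, MT_UC_MINUS = 7 };

/* Power-on default of the IA32_PAT MSR: WB, WT, UC-, UC, repeated twice.
 * Assumption: the OS has not reprogrammed it. */
#define PAT_RESET_DEFAULT 0x0007040600070406ULL

/* Build the 3-bit PAT index from a 4 KiB page table entry. */
unsigned pte_pat_index(uint64_t pte)
{
    unsigned pwt = (unsigned)((pte >> 3) & 1); /* page-level write-through */
    unsigned pcd = (unsigned)((pte >> 4) & 1); /* page-level cache disable */
    unsigned pat = (unsigned)((pte >> 7) & 1); /* PAT bit, 4 KiB PTEs only */
    return (pat << 2) | (pcd << 1) | pwt;
}

/* Look up the memory type that the PAT assigns to this entry. */
unsigned pte_memory_type(uint64_t pte, uint64_t pat_msr)
{
    return (unsigned)((pat_msr >> (8 * pte_pat_index(pte))) & 0x7);
}

/* Human-readable name for a memory type encoding. */
const char *memory_type_name(unsigned type)
{
    switch (type) {
    case MT_UC:       return "UC";
    case MT_WC:       return "WC";
    case MT_WT:       return "WT";
    case MT_WP:       return "WP";
    case MT_WB:       return "WB";
    case MT_UC_MINUS: return "UC-";
    default:          return "reserved";
    }
}
```

With the default PAT, an entry with PCD and PWT both clear decodes to WB, which is what a file-backed mapping is expected to show. The result is only the page-table side of the decision; the effective memory type also depends on the MTRR covering the physical address.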
