How does ARM Linux emulate the dirty, accessed, and file bits of a PTE?


Question

According to pgtable-2-level.h, ARM Linux keeps two versions of each PTE: the Linux PTE and the H/W PTE. The Linux PTEs are stored below an offset of 1024 bytes.

When a page fault is handled in handle_pte_fault, various functions such as pte_file, pte_mkdirty, and pte_mkyoung get invoked with the H/W PTE version.

But the ARM hardware does not actually support dirty, accessed, or file bits in its PTE.

My question is: how does it check the dirty, accessed, and file bits of a page on the H/W PTE? Ideally, shouldn't it check those bits on the Linux PTE, which is stored below an offset of 1024 bytes?

Recommended answer

"My question is: how does it check the dirty, accessed, and file bits of a page on the H/W PTE?"

TL;DR - they are emulated by taking a page fault on initial accesses.

The answer is given in pgtable-2-level.h:

The "dirty" bit is emulated by only granting hardware write permission iff the page is marked "writable" and "dirty" in the Linux PTE. This means that a write to a clean page will cause a permission fault, and the Linux MM layer will mark the page dirty via handle_pte_fault(). For the hardware to notice the permission change, the TLB entry must be flushed, and ptep_set_access_flags() does that for us.
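
The rule in that comment can be written as a small predicate. This is only a sketch: the SK_ flag names and their bit positions are made up for illustration and are not the kernel's L_PTE_* definitions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative software flags; names and bit positions are assumptions. */
#define SK_PTE_PRESENT  (1u << 0)
#define SK_PTE_WRITE    (1u << 1)
#define SK_PTE_DIRTY    (1u << 2)

/* Hardware write permission is granted iff the page is writable AND dirty. */
static bool sk_hw_write_allowed(uint32_t linux_pte)
{
    const uint32_t need = SK_PTE_WRITE | SK_PTE_DIRTY;

    /* The sketch only reasons about entries that are present. */
    assert(linux_pte & SK_PTE_PRESENT);
    return (linux_pte & need) == need;
}
```

A writable but clean page therefore gets a read-only hardware entry, which is what forces the first write to fault.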

To take the dirty case, the initial MMU mappings for a page are marked read-only. When a process writes to the page, a page fault is generated. This is the handle_pte_fault path referenced in the comment: the ARM entry point is do_page_fault in fault.c, which calls the generic handle_mm_fault, which eventually ends up in handle_pte_fault. The relevant code is:

if (flags & FAULT_FLAG_WRITE) {
    if (!pte_write(entry))
        return do_wp_page(mm, vma, address,
                          pte, pmd, ptl, entry);
    entry = pte_mkdirty(entry);  /* Here is the dirty emulation. */
}

So the generic Linux code examines the permissions of the page, sees that it is supposed to be writable, and calls pte_mkdirty to mark the page dirty; the whole process is kicked off, or emulated, through the fault handler. After the page is marked dirty in the Linux PTE, the ARM PTE is marked writable, so subsequent writes do not fault.
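
The whole round trip can be modelled in user space to make the "one fault, then none" behaviour concrete. This is a toy model, not kernel code: the dm_ names, the flag values, and the single-bit hardware entry are all assumptions, and the copy-on-write path (do_wp_page) is deliberately left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical flag values for this model only. */
#define DM_LINUX_WRITE  (1u << 0)
#define DM_LINUX_DIRTY  (1u << 1)
#define DM_HW_RW        (1u << 0)

/* One Linux PTE and the hardware PTE derived from it. */
struct dm_pte_pair {
    uint32_t linux_pte;
    uint32_t hw_pte;
};

/* Recompute the hardware entry from the Linux entry. */
static void dm_sync_hw(struct dm_pte_pair *p)
{
    const uint32_t need = DM_LINUX_WRITE | DM_LINUX_DIRTY;
    bool rw = (p->linux_pte & need) == need;

    p->hw_pte = rw ? DM_HW_RW : 0;
}

/* Simulate one store to the page; returns how many faults it took. */
static int dm_store(struct dm_pte_pair *p)
{
    int faults = 0;

    while (!(p->hw_pte & DM_HW_RW)) {
        /* Permission fault: this model has no copy-on-write path. */
        assert(p->linux_pte & DM_LINUX_WRITE);
        p->linux_pte |= DM_LINUX_DIRTY;   /* the pte_mkdirty step */
        dm_sync_hw(p);                    /* hardware entry becomes writable */
        faults++;
    }
    return faults;
}
```

Starting from a writable, clean entry, the first dm_store() takes one fault and sets the dirty flag; a second call takes none.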

The accessed bit works the same way, except that both reads and writes fault initially. A page with the file bit set is also left completely unmapped in hardware; when a fault occurs, the Linux PTE is consulted to see whether the page is backed by a file or is a genuinely unmapped page.
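
A sketch of how such a fault might be classified from the Linux PTE alone is shown here. The EX_ flag names, their bit positions, and the enum are hypothetical; the only facts taken from the text are that a page that has not been accessed, or that is not present, has no valid hardware entry, and that the file bit is examined when the page is not present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical software flags for this sketch. */
#define EX_PTE_PRESENT  (1u << 0)
#define EX_PTE_YOUNG    (1u << 1)
#define EX_PTE_FILE     (1u << 2)   /* meaningful only when not present */

enum ex_fault_kind {
    EX_FAULT_UNMAPPED,     /* nothing is mapped here at all */
    EX_FAULT_FILE_BACKED,  /* not present, but the file bit is set */
    EX_FAULT_MARK_YOUNG,   /* present; first access since "young" was cleared */
};

/* The hardware entry is valid only if the page is present AND young. */
static bool ex_hw_valid(uint32_t linux_pte)
{
    const uint32_t need = EX_PTE_PRESENT | EX_PTE_YOUNG;

    return (linux_pte & need) == need;
}

/* Classify a fault that was taken because the hardware entry was invalid. */
static enum ex_fault_kind ex_classify(uint32_t linux_pte)
{
    assert(!ex_hw_valid(linux_pte));

    if (!(linux_pte & EX_PTE_PRESENT))
        return (linux_pte & EX_PTE_FILE) ? EX_FAULT_FILE_BACKED
                                         : EX_FAULT_UNMAPPED;
    return EX_FAULT_MARK_YOUNG;
}
```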

After the hardware table is updated with the new permissions and the bookkeeping is done, the user-mode program is restarted at the faulting instruction. It does not notice any difference apart from the time taken to handle the fault.

ARM Linux uses 4k pages, and the ARM second-level page tables are 1k in size (256 entries * 4 bytes). From the pgtable-2-level.h comments:

Therefore, we tweak the implementation slightly - we tell Linux that we have 2048 entries in the first level, each of which is 8 bytes (iow, two hardware pointers to the second level.) The second level contains two hardware PTE tables arranged contiguously, preceded by Linux versions which contain the state information Linux needs. We, therefore, end up with 512 entries in the "PTE" level.
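
The arithmetic behind those numbers can be checked mechanically. The constant names below are invented for this sketch; the 4096-entry hardware first level comes from the same header comment, and the other values are the ones quoted above.

```c
#include <assert.h>

enum {
    HW_L1_ENTRIES         = 4096, /* hardware first level, 4 bytes per entry */
    HW_L1_ENTRY_BYTES     = 4,
    HW_L2_ENTRIES         = 256,  /* hardware second level, 4 bytes per entry */
    HW_PTE_BYTES          = 4,
    LINUX_PGD_ENTRIES     = 2048, /* what Linux is told about the first level */
    LINUX_PGD_ENTRY_BYTES = 8,
    LINUX_PTES_PER_TABLE  = 512,
    PAGE_BYTES            = 4096,
};

/* Pairing up hardware first-level entries does not change the table size. */
_Static_assert(HW_L1_ENTRIES * HW_L1_ENTRY_BYTES ==
               LINUX_PGD_ENTRIES * LINUX_PGD_ENTRY_BYTES,
               "first level stays 16 KiB");

/* Two hardware second-level tables give 512 entries at the PTE level. */
_Static_assert(2 * HW_L2_ENTRIES == LINUX_PTES_PER_TABLE,
               "512 PTEs per Linux page table");

/* Bytes used by two Linux tables plus two hardware tables. */
static unsigned pte_page_bytes(void)
{
    unsigned linux_tables = 2u * HW_L2_ENTRIES * HW_PTE_BYTES;
    unsigned hw_tables    = 2u * HW_L2_ENTRIES * HW_PTE_BYTES;

    assert(linux_tables + hw_tables == PAGE_BYTES);
    return linux_tables + hw_tables;
}
```

So one 4k page holds exactly two 1k hardware tables and their two 1k Linux shadows.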

In order to use the full 4K page, the PTE entries are structured as follows:

  1. Linux PTE [n]
  2. Linux PTE [n+1]
  3. ARM PTE [n]
  4. ARM PTE [n+1]
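
One way to picture this layout is as a C struct, assuming each of the four items is 1k (256 entries of 4 bytes). The struct and helper names are hypothetical; the kernel works with raw pointers and a fixed byte offset rather than a type like this.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define LY_PTRS_PER_PTE 512   /* two tables of 256 entries each */

/* One 4k page: two Linux tables followed by the two hardware tables. */
struct ly_pte_page {
    uint32_t linux_pte[LY_PTRS_PER_PTE];  /* Linux PTE [n] and [n+1] */
    uint32_t hw_pte[LY_PTRS_PER_PTE];     /* ARM PTE [n] and [n+1]   */
};

_Static_assert(sizeof(struct ly_pte_page) == 4096, "exactly one 4k page");

/* Given a pointer to a Linux PTE, find the matching hardware PTE. */
static uint32_t *ly_hw_slot(uint32_t *linux_ptep)
{
    assert(linux_ptep != NULL);
    return (uint32_t *)((unsigned char *)linux_ptep +
                        offsetof(struct ly_pte_page, hw_pte));
}
```

With this layout the hardware entry always sits 2048 bytes after its Linux counterpart, so no separate lookup is needed to find it.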

That makes four 1k items for a full 4k page. These collections of pages must be managed per process to give each process a unique view of memory, and some information is shared in order to conserve real RAM. The function cpu_set_pte_ext is used to change the physical ARM entries. As each ARM CPU revision uses slightly different table structures and features, there is an entry in the processor function table that points to an assembler routine. For instance, cpu_v7_set_pte_ext is the ARMv7 (typical original Cortex CPU) implementation. This routine is responsible for examining the Linux flags and updating the hardware bits accordingly. As can be seen, r3 is written to pte+2048 (the offset from the Linux PTE to the hardware PTE) at the end of this routine. The assembler macro armv3_set_pte_ext in proc-macros.S is used by many of the older CPU variants.

See also: Tim's notes on ARM MM
          Page table entry (PTE) descriptor in Linux kernel for ARM
