How exactly does copy on write work


Question

Say we have a parent process with some arbitrary amount of data stored in memory, and we use fork to spawn a child process. I understand that in order for the OS to perform copy on write, the page in memory that contains the data we are modifying will have its read-only bit set, and the OS will use the exception that results when the child tries to modify the data to copy the entire page into another area of memory, so that the child gets its own copy. What I don't understand is this: if that specific section of memory is marked read-only, then the parent, to whom the data originally belonged, would not be able to modify the data either. So how can this whole scheme work? Does the parent lose ownership of its data, so that copy on write has to be performed even when the parent itself tries to modify the data?

Answer

Right: if either process writes to a COW page, the write triggers a page fault.
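To see the effect from user space, here is a minimal sketch in C, assuming a POSIX system. The function name `cow_private_after_fork` and the 4096-byte buffer size are made up for illustration. It only demonstrates the visible result (parent and child each end up with a private view of the buffer after both write to it); it cannot show that the kernel used copy on write to get there.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define BUF_LEN 4096  /* assumed size; roughly one page on many systems */

/* Hypothetical helper: returns 1 if parent and child each kept a private
 * view of the buffer after both wrote to it, 0 if not, -1 on error. */
int cow_private_after_fork(void)
{
    char *buf = malloc(BUF_LEN);
    if (buf == NULL)
        return -1;
    memset(buf, 'P', BUF_LEN);  /* touch the memory before forking */

    pid_t pid = fork();
    if (pid < 0) {
        free(buf);
        return -1;
    }

    if (pid == 0) {
        /* Child: sees the parent's data, then writes to it. */
        int saw_parent_data = (buf[0] == 'P');
        buf[0] = 'C';
        _exit((saw_parent_data && buf[0] == 'C') ? 0 : 1);
    }

    /* Parent: also writes, to a different byte of the same buffer. */
    buf[1] = 'X';

    int status;
    if (waitpid(pid, &status, 0) < 0) {
        free(buf);
        return -1;
    }
    int child_ok = WIFEXITED(status) && WEXITSTATUS(status) == 0;
    /* The child's 'C' must not be visible here. */
    int parent_ok = (buf[0] == 'P' && buf[1] == 'X');

    free(buf);
    return child_ok && parent_ok;
}
```

Calling `cow_private_after_fork()` from a `main` should return 1: neither process sees the other's store, no matter which one writes first.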

In the page fault handler, if the page is supposed to be writeable, the kernel allocates a new physical page and does a memcpy(newpage, shared_page, pagesize), then updates the page table of whichever process faulted to map the new page at that virtual address. It then returns to user space so the store instruction can re-run.
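The handler logic described above can be modelled as a toy user-space simulation. This is only a sketch, not kernel code: `struct page`, `struct pte`, `share_page`, `handle_write_fault` and `store_byte` are hypothetical names, and the reference count plus the "reuse the page if we hold the last reference" branch are assumptions added to keep the model self-consistent.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define PAGE_SIZE 4096

struct page {                 /* stands in for a physical page */
    int refcount;             /* how many mappings point at it (assumed) */
    unsigned char data[PAGE_SIZE];
};

struct pte {                  /* stands in for one page-table entry */
    struct page *page;
    int writable;
};

/* Model of fork for one page: both entries map the same page read-only. */
void share_page(struct pte *parent, struct pte *child)
{
    child->page = parent->page;
    parent->page->refcount++;
    parent->writable = 0;
    child->writable = 0;
}

/* Model of the write-fault handler. Returns 0 on success, -1 if no memory. */
int handle_write_fault(struct pte *pte)
{
    struct page *shared_page = pte->page;

    if (shared_page->refcount > 1) {
        struct page *newpage = malloc(sizeof *newpage);
        if (newpage == NULL)
            return -1;
        memcpy(newpage->data, shared_page->data, PAGE_SIZE);
        newpage->refcount = 1;
        shared_page->refcount--;
        pte->page = newpage;  /* remap only the faulting process */
    }
    /* If this was the last reference, the page is simply reused. */
    pte->writable = 1;
    return 0;
}

/* Model of a store instruction: fault first if needed, then re-run. */
int store_byte(struct pte *pte, size_t offset, unsigned char value)
{
    assert(offset < PAGE_SIZE);
    if (!pte->writable && handle_write_fault(pte) != 0)
        return -1;
    pte->page->data[offset] = value;
    return 0;
}
```

In this model, whichever side stores first gets the copy, and the other side keeps the original page. When that other side later writes, it faults too, but it now holds the only reference, so no second copy is needed.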

This is a win for something like fork, because one process (normally the child) typically makes an execve system call right away, after touching typically only one page (of stack memory). execve destroys all memory mappings for that process, effectively replacing it with a new process. The parent once again has the only copy of every page. (The exception is pages that were already copy-on-write before the fork: for example, anonymous memory allocated with mmap is typically COW-mapped to a single physical page of zeros until it is first written, so reads can hit in L1d cache.)
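The fork-then-execve pattern referred to above looks roughly like this in C. This is a sketch assuming a POSIX system with a shell at `/bin/sh`; the helper name `run_shell_command` is made up for illustration.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

extern char **environ;

/* Hypothetical helper: runs `cmd` via /bin/sh in a child process and
 * returns its exit status, or -1 on error or abnormal termination. */
int run_shell_command(const char *cmd)
{
    pid_t pid = fork();       /* child shares the parent's pages */
    if (pid < 0)
        return -1;

    if (pid == 0) {
        /* Child: touches little more than its stack before execve
         * replaces its whole address space. */
        char *const argv[] = { "sh", "-c", (char *)cmd, NULL };
        execve("/bin/sh", argv, environ);
        _exit(127);           /* reached only if execve failed */
    }

    int status;
    if (waitpid(pid, &status, 0) < 0)
        return -1;
    return WIFEXITED(status) ? WEXITSTATUS(status) : -1;
}
```

For example, `run_shell_command("exit 7")` should return 7. Between fork and execve the child writes only to its own stack, so with copy on write very few pages ever need to be copied.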

A smart optimization would be for fork to actually copy the page containing the top of the stack, but still do lazy COW for all the other pages, on the assumption that the child process will normally execve right away and thus drop its references to all the other pages. It still costs a TLB invalidation in the parent to temporarily flip all the pages to read-only and back, though.
