When should I use mmap for file access?


Question

POSIX environments provide at least two ways of accessing files. There are the standard system calls open(), read(), write(), and friends, but there is also the option of using mmap() to map the file into virtual memory.

When is it preferable to use one over the other? What are their individual advantages that merit including two interfaces?

Recommended answer

mmap is great if you have multiple processes accessing data from the same file in a read-only fashion, which is common in the kind of server systems I write. mmap allows all those processes to share the same physical memory pages, saving a lot of memory.
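As a rough illustration of the read-only sharing case, here is a minimal sketch using Python's standard mmap module, which wraps the POSIX call. The helper names checksum_readonly and run_readers are hypothetical, and os.fork() assumes a Unix-like system. Each child maps the same file read-only, so the kernel can back all of the mappings with the same page-cache pages.

```python
import mmap
import os
import zlib


def checksum_readonly(path):
    """Map `path` read-only and return the CRC32 of its contents."""
    with open(path, "rb") as f:
        # Length 0 maps the whole file; ACCESS_READ gives a read-only mapping.
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            return zlib.crc32(m)


def run_readers(path, workers=3):
    """Fork `workers` children that each map the same file read-only.

    Returns True if every child saw the same contents as the parent.
    """
    expected = checksum_readonly(path)
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:
            # Child process: report the comparison through the exit status.
            status = 1
            try:
                status = 0 if checksum_readonly(path) == expected else 1
            finally:
                os._exit(status)
        pids.append(pid)
    statuses = [os.waitpid(pid, 0)[1] for pid in pids]
    return all(os.WIFEXITED(s) and os.WEXITSTATUS(s) == 0 for s in statuses)
```

The sketch only shows the access pattern; how much memory is actually saved depends on the operating system and on the size of the file.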

mmap also allows the operating system to optimize paging operations. For example, consider two programs: program A, which reads a 1 MB file into a buffer created with malloc, and program B, which mmaps the 1 MB file into memory. If the operating system has to swap part of A's memory out, it must write the contents of the buffer to swap before it can reuse the memory. In B's case, any unmodified mmap'd pages can be reused immediately, because the OS knows how to restore them from the existing file they were mmap'd from. (The OS can detect which pages are unmodified by initially marking writable mmap'd pages as read-only and catching the resulting page faults, similar to a copy-on-write strategy.)
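The two programs in this example can be sketched as follows, again with Python's mmap module standing in for the C calls. The function names are hypothetical, and f.read() plays the role of reading into a malloc'd buffer. The code cannot demonstrate the paging behaviour itself; it only shows where the bytes live in each case.

```python
import mmap


def load_like_program_a(path):
    """Program A: copy the whole file into a private heap buffer."""
    with open(path, "rb") as f:
        # The returned bytes object is anonymous memory. If the OS evicts
        # it, the contents have to be written to swap first.
        return f.read()


def map_like_program_b(path):
    """Program B: map the file instead of copying it."""
    with open(path, "rb") as f:
        # Pages of a read-only mapping stay backed by the file, so the OS
        # can drop them and fault them back in from the file later.
        # The mapping remains valid after the file object is closed.
        return mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
```

Both functions give access to the same bytes; the difference is in what the kernel has to do under memory pressure.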

mmap is also useful for inter-process communication. You can mmap a file as read/write in the processes that need to communicate and then use synchronization primitives in the mmap'd region (this is what the MAP_HASSEMAPHORE flag is for, on systems that define it, such as the BSDs).
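A minimal sketch of this kind of communication, assuming a Unix-like system and Python's mmap module. The names map_shared and send_via_mapping are hypothetical, and the message layout (a 4-byte length followed by the payload) is made up for the example. To keep it short, the parent simply waits for the child to exit instead of using a synchronization primitive inside the region, and MAP_HASSEMAPHORE is not used.

```python
import mmap
import os
import struct


def map_shared(path, size):
    """Map the first `size` bytes of `path` read/write with MAP_SHARED."""
    with open(path, "r+b") as f:
        return mmap.mmap(f.fileno(), size, flags=mmap.MAP_SHARED,
                         prot=mmap.PROT_READ | mmap.PROT_WRITE)


def send_via_mapping(path, message):
    """Have a child process pass `message` to the parent through the file."""
    size = mmap.PAGESIZE
    if len(message) > size - 4:
        raise ValueError("message does not fit in one page")
    with open(path, "wb") as f:
        f.truncate(size)  # the file must be at least as large as the mapping
    parent_view = map_shared(path, size)
    pid = os.fork()
    if pid == 0:
        # Child: create its own mapping of the same file and write into it.
        status = 1
        try:
            child_view = map_shared(path, size)
            child_view[0:4] = struct.pack("<I", len(message))
            child_view[4:4 + len(message)] = message
            status = 0
        finally:
            os._exit(status)
    # Waiting for the child is a stand-in for real synchronization.
    os.waitpid(pid, 0)
    (length,) = struct.unpack("<I", parent_view[0:4])
    data = parent_view[4:4 + length]
    parent_view.close()
    return data
```

In a long-running system, both sides would stay alive and coordinate through a lock or semaphore rather than through process exit.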

One place mmap can be awkward is when you need to work with very large files on a 32-bit machine. This is because mmap has to find a contiguous block of addresses in your process's address space that is large enough to fit the entire range of the file being mapped. This can become a problem if your address space is fragmented: you might have 2 GB of address space free, but no single contiguous range of it that can fit a 1 GB file mapping. In that case you may have to map the file in smaller chunks than you would like in order to make it fit.
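Mapping a file in smaller chunks might look like the sketch below, written with Python's mmap module. The function name crc32_in_chunks and the default window size are assumptions for illustration. Only one window of the file occupies address space at a time, and each window starts at an offset that is a multiple of mmap.ALLOCATIONGRANULARITY.

```python
import mmap
import os
import zlib


def crc32_in_chunks(path, chunk_size=64 * mmap.ALLOCATIONGRANULARITY):
    """Compute the CRC32 of a file by mapping one window at a time."""
    if chunk_size <= 0 or chunk_size % mmap.ALLOCATIONGRANULARITY:
        raise ValueError("chunk_size must be a positive multiple of "
                         "mmap.ALLOCATIONGRANULARITY")
    file_size = os.path.getsize(path)
    crc = 0
    with open(path, "rb") as f:
        offset = 0
        while offset < file_size:
            # The last window may be shorter than chunk_size.
            length = min(chunk_size, file_size - offset)
            with mmap.mmap(f.fileno(), length, access=mmap.ACCESS_READ,
                           offset=offset) as window:
                crc = zlib.crc32(window, crc)
            offset += length
    return crc
```

The cost is extra bookkeeping: any record that straddles a window boundary has to be handled by the caller.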

Another potential awkwardness with mmap as a replacement for read/write is that the file offset at which a mapping starts must be a multiple of the page size. If you just want to get some data at offset X, you will need to fix up that offset so it is compatible with mmap.
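The offset fix-up can be sketched like this with Python's mmap module, where the required alignment is exposed as mmap.ALLOCATIONGRANULARITY. The helper name read_at is hypothetical, and the sketch assumes the requested range lies entirely inside the file. The idea is to round the offset down to an aligned boundary, map a little extra, and then skip the extra bytes.

```python
import mmap


def read_at(path, offset, size):
    """Return `size` bytes starting at an arbitrary `offset` using mmap."""
    granularity = mmap.ALLOCATIONGRANULARITY
    # Round the requested offset down to the nearest aligned boundary.
    aligned = offset - (offset % granularity)
    # Number of bytes between the aligned start and the data we want.
    delta = offset - aligned
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), delta + size, access=mmap.ACCESS_READ,
                       offset=aligned) as m:
            return m[delta:delta + size]
```

For example, with a 4096-byte granularity, a request for offset 20000 maps from 16384 and skips the first 3616 bytes of the mapping.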

And finally, read/write are the only way you can work with some types of files. mmap can't be used on things like pipes and ttys.
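A small probe can show the difference between a regular file and a pipe. This is a sketch using Python's mmap module; the helper name can_mmap is hypothetical, and the exact error raised for a pipe is platform-dependent, so the code just treats any OSError or ValueError as "cannot be mapped".

```python
import mmap


def can_mmap(fd, length=mmap.PAGESIZE):
    """Return True if `length` bytes of descriptor `fd` can be mapped."""
    try:
        m = mmap.mmap(fd, length, access=mmap.ACCESS_READ)
    except (OSError, ValueError):
        # Pipes, ttys and similar descriptors have no pages to map.
        return False
    m.close()
    return True
```

Called on the read end of os.pipe(), this is expected to return False, whereas a regular file of at least one page can be mapped. For descriptors that cannot be mapped, read() and write() remain the way to move data.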
