Speeding up file I/O: mmap() vs. read()


Problem Description


I have a Linux application that reads 150-200 files (4-10GB) in parallel. Each file is read in turn in small, variably sized blocks, typically less than 2K each.


I currently need to maintain over 200 MB/s read rate combined from the set of files. The disks handle this just fine. There is a projected requirement of over 1 GB/s (which is out of the disk's reach at the moment).


We have implemented two different read systems; both make heavy use of posix_advise. The first is an mmap()-based read in which we map the entirety of the data set and read on demand. The second is a read()/seek()-based system.


Both work well, but only for moderate cases. The read() method manages our overall file cache much better and can deal well with hundreds of GB of files, but is badly rate-limited; mmap is able to pre-cache data, making a sustained data rate of over 200 MB/s easy to maintain, but it cannot deal with large total data set sizes.
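For context, the posix_advise hints mentioned above (the actual Linux API is spelled posix_fadvise) can be sketched roughly as below. The function name and the specific advice flags chosen are illustrative assumptions, not the questioner's actual code:

```c
#define _POSIX_C_SOURCE 200112L
#include <fcntl.h>

/* Hint the kernel that this file will be read sequentially and soon.
 * Returns 0 on success, or an errno-style value on failure. */
static int advise_sequential(int fd)
{
    /* Offset 0 with length 0 means "from start to end of file". */
    int rc = posix_fadvise(fd, 0, 0, POSIX_FADV_SEQUENTIAL);
    if (rc == 0)
        rc = posix_fadvise(fd, 0, 0, POSIX_FADV_WILLNEED);
    return rc;
}
```

POSIX_FADV_SEQUENTIAL doubles the kernel's read-ahead window for the file, and POSIX_FADV_WILLNEED starts asynchronous read-ahead, which is why tuning these is the usual first step before resorting to mmap.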

So my questions are:


A: Can read()-type file I/O be further optimized beyond the posix_advise calls on Linux? Or, having tuned the disk scheduler, VMM, and posix_advise calls, is that as good as we can expect?


B: Are there systematic ways for mmap to better deal with very large mapped data?

The similar question mmap-vs-reading-blocks covers some of what I am working on here, and the discussion on that question provides a good starting point for this problem.

Recommended Answer


Reads back to what? What is the final destination of this data?


Since it sounds like you are completely IO bound, mmap and read should make no difference. The interesting part is in how you get the data to your receiver.


Assuming you're putting this data into a pipe, I recommend you just dump the contents of each file in its entirety into the pipe. To do this using zero-copy, try the splice system call. You might also try copying the file manually, or forking an instance of cat or some other tool that can buffer heavily, with the current file as stdin and the pipe as stdout.

pid_t pid = fork();
if (pid > 0) {
    waitpid(pid, NULL, 0);            /* parent: wait for cat to finish */
} else if (pid == 0) {
    dup2(dest, 1);                    /* pipe write end becomes stdout */
    dup2(source, 0);                  /* current file becomes stdin */
    execlp("cat", "cat", (char *)NULL);
    _exit(127);                       /* reached only if exec fails */
}
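The zero-copy splice alternative suggested above could look roughly like this. The function name and the 64 KB chunk size are assumptions for illustration, not part of the original answer:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Move the contents of source_fd into pipe_fd without copying the data
 * through userspace. pipe_fd must be the write end of a pipe.
 * Returns 0 at end of file, -1 on error. */
static int splice_file_to_pipe(int source_fd, int pipe_fd)
{
    for (;;) {
        ssize_t n = splice(source_fd, NULL, pipe_fd, NULL,
                           64 * 1024, SPLICE_F_MORE);
        if (n == 0)          /* end of file reached */
            return 0;
        if (n < 0)           /* error, e.g. EINVAL or EIO */
            return -1;
    }
}
```

Because splice moves pages between the page cache and the pipe buffer directly, the data is never copied into a userspace buffer, which is what makes it attractive for a purely I/O-bound pipeline like this.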


Update0

If your processing is file-agnostic, and doesn't require random access, you want to create a pipeline using the options outlined above. Your processing step should accept data from stdin, or a pipe.


To answer your more specific questions:


A: Can read()-type file I/O be further optimized beyond the posix_advise calls on Linux? Or, having tuned the disk scheduler, VMM, and posix_advise calls, is that as good as we can expect?


That's as good as it gets with regard to telling the kernel what to do from userspace. The rest is up to you: buffering, threading, etc., but it's dangerous and probably unproductive guesswork. I'd just go with splicing the files into a pipe.


B: Are there systematic ways for mmap to better deal with very large mapped data?


Yes. The following options may give you awesome performance benefits (and, with testing, may make mmap worth using over read()):


  • MAP_HUGETLB
    Allocate the mapping using "huge pages."


This will reduce the paging overhead in the kernel, which is great if you will be mapping gigabyte-sized files.


  • MAP_NORESERVE
    Do not reserve swap space for this mapping. When swap space is reserved, one has the guarantee that it is possible to modify the mapping. When swap space is not reserved, one might get SIGSEGV upon a write if no physical memory is available.


This will prevent you from running out of memory while keeping your implementation simple, if you don't actually have enough physical memory plus swap for the entire mapping.


  • MAP_POPULATE
    Populate (prefault) page tables for the mapping. For a file mapping, this causes read-ahead on the file. Later accesses to the mapping will not be blocked by page faults.


This may give you speed-ups with sufficient hardware resources, and if the prefetching is ordered and lazy. I suspect this flag is redundant; the VFS likely does this better by default.
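Putting those flags together, mapping an entire read-only file might look like the sketch below. The function name and the fallback logic are my additions: MAP_HUGETLB typically fails for an ordinary file unless huge pages are configured, so this sketch retries with normal pages:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stddef.h>
#include <sys/mman.h>
#include <sys/stat.h>

/* Map an entire file read-only, preferring huge pages and prefaulting
 * the page tables. Returns MAP_FAILED on error; *len_out receives the
 * mapping length for a later munmap(). */
static void *map_whole_file(int fd, size_t *len_out)
{
    struct stat st;
    if (fstat(fd, &st) < 0)
        return MAP_FAILED;
    *len_out = (size_t)st.st_size;

    int flags = MAP_PRIVATE | MAP_NORESERVE | MAP_POPULATE;
    /* Try huge pages first; this usually requires hugetlbfs setup,
     * so fall back to a normal mapping if the kernel refuses. */
    void *p = mmap(NULL, *len_out, PROT_READ, flags | MAP_HUGETLB, fd, 0);
    if (p == MAP_FAILED)
        p = mmap(NULL, *len_out, PROT_READ, flags, fd, 0);
    return p;
}
```

MAP_POPULATE makes the mmap call itself pay the read-ahead cost up front, which is the behavior that let the questioner's mmap path sustain over 200 MB/s.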

