Why is dd with the 'direct' (O_DIRECT) flag so dramatically faster?


Question


I have a server with a RAID50 configuration of 24 drives (two groups of 12), and if I run:

dd if=/dev/zero of=ddfile2 bs=1M count=1953 oflag=direct

I get:

2047868928 bytes (2.0 GB) copied, 0.805075 s, 2.5 GB/s

But if I run:

dd if=/dev/zero of=ddfile2 bs=1M count=1953

I get:

2047868928 bytes (2.0 GB) copied, 2.53489 s, 808 MB/s


I understand that O_DIRECT causes the page cache to be bypassed. But as I understand it, bypassing the page cache basically just means avoiding a memcpy. Testing on my desktop with the bandwidth tool, I get a worst-case sequential memory write bandwidth of 14 GB/s, and I imagine on the newer, much more expensive server the bandwidth must be even better. So why would an extra memcpy cause a >2x slowdown? Is there really a lot more involved when using the page cache? Is this atypical?

Answer


In the direct=1 case you're giving the kernel the ability to write the data out straight away (which means it is less likely to be held up behind a sync of unrelated data), and you're saving the kernel work in terms of CPU and time (no extra copies from userland into the kernel, no buffer cache management operations to perform), so it's a faster path. That giant block size is most likely bigger than the RAID's block size, so it will be split up in the kernel and those smaller pieces submitted in parallel; thus the coalescing you often get from buffered writeback of tiny I/Os isn't worth much here.
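A side note on the benchmark itself: in the buffered case, dd prints its timing as soon as the last write() returns, while some of the data may still be sitting in the page cache rather than on disk. If you want the buffered number to be directly comparable to the oflag=direct run, the usual approach is conv=fdatasync, which makes dd flush the file data to disk before reporting its statistics (shown here with a smaller count for brevity; substitute your own count):

```shell
# Buffered write, but flush file data to disk before dd reports timing,
# so the result is comparable to the oflag=direct run.
dd if=/dev/zero of=ddfile2 bs=1M count=100 conv=fdatasync
```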


Summary: Your I/O pattern doesn't really benefit from buffering (I/Os are huge, data is not being reused, I/O is streaming sequential) so you're in an optimal scenario for O_DIRECT. See these slides by the original author of Linux's O_DIRECT for the original motivation behind it.

