File I/O with streams - best memory buffer size


Question

I am writing a small I/O library to assist with a larger (hobby) project. A part of this library performs various functions on a file, which is read / written via the FileStream object. On each StreamReader.Read(...) pass, I fire off an event which will be used in the main app to display progress information. The processing that goes on in the loop is varied, but is not too time consuming (it could just be a simple file copy, for example, or may involve encryption...).

My main question is: What is the best memory buffer size to use? Thinking about physical disk layouts, I could pick 2 KB, which would cover a CD sector size and is a nice multiple of a 512-byte hard disk sector. Higher up the abstraction tree, you could go for a larger buffer which could read an entire FAT cluster at a time. I realise that with today's PCs I could go for a more memory-hungry option (a couple of MiB, for example), but then I increase the time between UI updates and the user perceives a less responsive application.

As an aside, I'm eventually hoping to provide a similar interface to files hosted on FTP / HTTP servers (over a local network / fastish DSL). What would be the best memory buffer size for those (again, a "best-case" tradeoff between perceived responsiveness and performance)?

Recommended answer

Files are already buffered by the file system cache. You just need to pick a buffer size that doesn't force FileStream to make the native Windows ReadFile() API call to fill the buffer too often. Don't go below a kilobyte; more than 16 KB is a waste of memory and unfriendly to the CPU's L1 cache (typically 16 or 32 KB of data).
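As a rough illustration of the advice above, here is a minimal sketch of a copy loop that uses a 4 KB buffer and reports progress after each read. The class name `BufferedCopy`, the method `Copy`, and the `onProgress` callback are hypothetical names chosen for this example; they are not part of the asker's library or of .NET.

```csharp
using System;
using System.IO;

public static class BufferedCopy
{
    // Hypothetical helper: copies sourcePath to destinationPath and
    // reports the running byte count after every read.
    public static long Copy(string sourcePath, string destinationPath,
                            int bufferSize = 4096,
                            Action<long> onProgress = null)
    {
        var buffer = new byte[bufferSize];
        long total = 0;

        // The last constructor argument sets FileStream's internal buffer size.
        using (var input = new FileStream(sourcePath, FileMode.Open,
                                          FileAccess.Read, FileShare.Read, bufferSize))
        using (var output = new FileStream(destinationPath, FileMode.Create,
                                           FileAccess.Write, FileShare.None, bufferSize))
        {
            int read;
            // Read returns 0 at end of file.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
                total += read;
                onProgress?.Invoke(total);
            }
        }

        return total;
    }
}
```

A caller might use it as `BufferedCopy.Copy(src, dst, 4096, done => Console.WriteLine(done))`. With a buffer this small the callback fires very often, so a real UI would probably want to update only every so many calls rather than redraw on each one.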

4 KB is a traditional choice, even though that will exactly span a virtual memory page only ever by accident. It is difficult to profile; you'll end up measuring how long it takes to read a cached file, which runs at RAM speeds (5 gigabytes/sec and up) if the data is available in the cache. It will be in the cache the second time you run your test, and that won't happen in a production environment too often. File I/O is completely dominated by the disk drive or the NIC and is glacially slow; copying the data is peanuts. 4 KB will work fine.
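To see the profiling difficulty for yourself, a small timing helper along the following lines may be enough. This is only a sketch: `ReadTiming` and `TimeRead` are hypothetical names, and the numbers it produces depend on whether the file is already in the file system cache.

```csharp
using System;
using System.Diagnostics;
using System.IO;

public static class ReadTiming
{
    // Hypothetical helper: reads the whole file with the given buffer size
    // and returns the byte count together with the elapsed wall-clock time.
    public static (long Bytes, TimeSpan Elapsed) TimeRead(string path, int bufferSize)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        var watch = Stopwatch.StartNew();

        using (var input = new FileStream(path, FileMode.Open,
                                          FileAccess.Read, FileShare.Read, bufferSize))
        {
            int read;
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                total += read;
            }
        }

        watch.Stop();
        return (total, watch.Elapsed);
    }
}
```

Calling `TimeRead` on the same file with 1024, 4096 and 16384 in turn will usually show the first call paying for the disk access and the later ones being served from the cache, so the differences between buffer sizes are easy to misread.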
