Read whole file in memory vs. read in chunks
Question
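I'm relatively new to C# and programming, so please bear with me. I'm working on an application where I need to read some files and process them in chunks (for example, the data is processed in chunks of 48 bytes).
I would like to know which is better performance-wise: reading the whole file into memory at once and then processing it, reading the file in chunks and processing each chunk directly, or reading the data in larger blocks (several chunks at a time) and then processing those.
How I understand things so far: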
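Read whole file in memory
Pros:
- It's fast, because the most time-expensive operation is seeking; once the head is in place, it can read quite fast.
Cons:
- It consumes a lot of memory.
- It consumes a lot of memory in a very short time (this is what I am mainly afraid of, because I do not want it to noticeably impact overall system performance).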
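As a minimal sketch of the whole-file approach, the code below loads the file with File.ReadAllBytes and then walks the array in 48-byte records (the example size from the question). The class name, the method name, and the processRecord callback are hypothetical names chosen for illustration; a trailing partial record is simply ignored here, which may or may not suit the real data format.

```csharp
using System;
using System.IO;

static class WholeFileExample
{
    // Example record size taken from the question (48 bytes).
    const int RecordSize = 48;

    // Hypothetical helper: reads the entire file into memory, then hands each
    // complete record to the caller. Returns the number of records processed.
    public static int ProcessWholeFile(string path, Action<byte[], int, int> processRecord)
    {
        // One allocation as large as the file itself.
        byte[] data = File.ReadAllBytes(path);

        int records = 0;
        for (int offset = 0; offset + RecordSize <= data.Length; offset += RecordSize)
        {
            processRecord(data, offset, RecordSize);
            records++;
        }
        return records;
    }
}
```

The memory cost is visible in the single byte[] allocation: it grows linearly with the file size, which is exactly the concern listed under the cons.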
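Read file in chunks
Pros:
- It's easier (more intuitive) to implement:
    while (numberOfBytes2Read > 0)
        read n bytes
        process read data
- It consumes very little memory.
Cons:
- It could take much more time if the disk has to seek the file again and move the head to the appropriate position, which on average costs around 12 ms.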
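The loop sketched in the pseudocode above could look roughly like this in C#. It also covers the third option from the question, reading several records per call: the buffer holds 1024 records of 48 bytes, which is an assumed batch size and not a recommendation from the original post. The class, method, and callback names are hypothetical.

```csharp
using System;
using System.IO;

static class ChunkedReadExample
{
    const int RecordSize = 48;        // example record size from the question
    const int RecordsPerRead = 1024;  // assumed batch size (48 KB per batch)

    // Hypothetical helper: reads the file batch by batch and hands each
    // complete record to the caller. Returns the number of records processed.
    public static long ProcessInChunks(string path, Action<byte[], int, int> processRecord)
    {
        byte[] buffer = new byte[RecordSize * RecordsPerRead];
        long records = 0;

        using (FileStream stream = File.OpenRead(path))
        {
            while (true)
            {
                // Stream.Read may return fewer bytes than requested, so keep
                // reading until the buffer is full or the file ends.
                int filled = 0;
                int read;
                while (filled < buffer.Length &&
                       (read = stream.Read(buffer, filled, buffer.Length - filled)) > 0)
                {
                    filled += read;
                }

                if (filled == 0) break; // nothing left to read

                // Process every complete record in this batch;
                // a trailing partial record is ignored.
                for (int offset = 0; offset + RecordSize <= filled; offset += RecordSize)
                {
                    processRecord(buffer, offset, RecordSize);
                    records++;
                }

                if (filled < buffer.Length) break; // short batch means end of file
            }
        }
        return records;
    }
}
```

Memory use stays fixed at one buffer regardless of file size. Because the reads are sequential, the operating system's own read-ahead and caching will often hide much of the seek cost, though that depends on the disk and on what else is competing for it.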
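I know that the answer depends on file size (and hardware). I assume it is better to read the whole file at once, but up to what file size does this hold? What is the maximum recommended size to read into memory at once (in bytes, or relative to the hardware, for example as a percentage of RAM)?
Thank you for your answers and time.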
Answer
It is recommended to read files in buffers of 4 KB or 8 KB.
You should really never read a file all at once if you want to write it back to another stream. Just read into a buffer and write the buffer back. This is especially true for web programming.
If you have to load the whole file because your operation (text processing, etc.) needs the whole content of the file, buffering does not really help, so I believe it is preferable to use File.ReadAllText or File.ReadAllBytes.
Why 4 KB or 8 KB?
This is closer to the underlying Windows operating system buffers. Files on NTFS are normally stored on disk in 4 KB or 8 KB chunks (clusters), although you can choose a larger size such as 32 KB.
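To illustrate the "read to a buffer and write the buffer back" advice, here is a minimal sketch that copies one file to another through an 8 KB buffer. The class and method names are hypothetical, and file-to-file copying is only an assumed stand-in for whatever source and destination streams the real application uses.

```csharp
using System.IO;

static class BufferedCopyExample
{
    // 8 KB buffer, one of the two sizes suggested in the answer.
    const int BufferSize = 8 * 1024;

    // Hypothetical helper: copies sourcePath to destinationPath one buffer
    // at a time and returns the number of bytes copied.
    public static long Copy(string sourcePath, string destinationPath)
    {
        byte[] buffer = new byte[BufferSize];
        long total = 0;

        using (FileStream input = File.OpenRead(sourcePath))
        using (FileStream output = File.Create(destinationPath))
        {
            int read;
            // Read returns 0 only at end of stream; write back exactly
            // the number of bytes that were actually read.
            while ((read = input.Read(buffer, 0, buffer.Length)) > 0)
            {
                output.Write(buffer, 0, read);
                total += read;
            }
        }
        return total;
    }
}
```

On .NET Framework 4 and later, Stream.CopyTo(destination, bufferSize) performs the same kind of buffered loop, so the explicit version is mainly useful when each buffer needs some processing before it is written.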