Read whole file in memory VS read in chunks


Question

I'm relatively new to C# and programming, so please bear with me. I'm working on an application where I need to read some files and process those files in chunks (for example, data is processed in chunks of 48 bytes).

I would like to know which is better, performance-wise: to read the whole file into memory at once and then process it, to read the file in chunks and process them directly, or to read the data in larger blocks (each holding multiple chunks, which are then processed).

How I understand things so far:

Read whole file in memory
pros:
-It's fast, because the most time-consuming operation is seeking; once the head is in place, it can read quite fast

cons:
-It consumes a lot of memory
-It consumes a lot of memory in a very short time (this is what I am mainly afraid of, because I do not want it to noticeably impact overall system performance)

Read file in chunks
pros:
-It's easier (more intuitive) to implement

while(numberOfBytes2Read > 0)
   read n bytes
   process read data

-It consumes very little memory

cons:
-It could take much more time if the disk has to seek the file again and move the head to the appropriate position, which on average costs around 12ms.

I know that the answer depends on file size (and hardware). I assume it is better to read the whole file at once, but up to what file size does this hold? What is the maximum recommended size to read into memory at once (in bytes, or relative to the hardware, for example as a % of RAM)?

Thank you for your answers and time.

Solution

It is recommended to read files in buffers of 4K or 8K.

You should really never read a file all at once if you want to write it back to another stream. Just read into a buffer and write the buffer back. This is especially true for web programming.
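The read-into-a-buffer, write-the-buffer-back loop described above could look roughly like the following. This is only a minimal sketch: the class name `StreamCopyExample`, the method name `CopyInChunks`, and the 8192-byte default are assumptions made for illustration, not something from the original answer.

```csharp
using System;
using System.IO;

public static class StreamCopyExample
{
    // Copies everything from source to destination through one fixed-size buffer,
    // so at most bufferSize bytes of file data are held in memory at a time.
    // The 8192 default is an assumption that follows the 4K/8K suggestion.
    public static long CopyInChunks(Stream source, Stream destination, int bufferSize = 8192)
    {
        var buffer = new byte[bufferSize];
        long total = 0;
        int read;

        // Read returns 0 at end of stream and may return fewer bytes than requested,
        // so always write exactly the number of bytes that were actually read.
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            destination.Write(buffer, 0, read);
            total += read;
        }
        return total;
    }
}
```

The built-in `Stream.CopyTo(destination, bufferSize)` does the same job; the explicit loop is mainly useful when each chunk needs to be inspected or transformed on the way through.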

If you have to load the whole file since your operation (text-processing, etc) needs the whole content of the file, buffering does not really help, so I believe it is preferable to use File.ReadAllText or File.ReadAllBytes.
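For the 48-byte records mentioned in the question, a middle ground is to read a block that holds many records and then walk through it record by record. The sketch below assumes hypothetical names (`RecordReader`, `ProcessRecords`, `recordsPerBlock`) and assumes that trailing bytes that do not form a complete record can simply be skipped; adjust that if a partial record should be an error.

```csharp
using System;
using System.IO;

public static class RecordReader
{
    public const int RecordSize = 48; // record size taken from the question

    // Reads the stream in blocks of recordsPerBlock records and passes each complete
    // record to onRecord as (buffer, offset). Returns the number of records processed.
    // With the assumed default of 128 records, the block is 6144 bytes.
    public static int ProcessRecords(Stream input, Action<byte[], int> onRecord, int recordsPerBlock = 128)
    {
        var buffer = new byte[RecordSize * recordsPerBlock];
        int count = 0;

        while (true)
        {
            // Fill the buffer as far as possible; a single Read call may return
            // fewer bytes than requested even before the end of the stream.
            int filled = 0;
            int read;
            while (filled < buffer.Length &&
                   (read = input.Read(buffer, filled, buffer.Length - filled)) > 0)
            {
                filled += read;
            }
            if (filled == 0) break;

            // Process whole records only; leftover bytes at the end are ignored here.
            int whole = filled / RecordSize;
            for (int i = 0; i < whole; i++)
            {
                onRecord(buffer, i * RecordSize);
                count++;
            }

            if (filled < buffer.Length) break; // short block means end of stream
        }
        return count;
    }
}
```

If the whole content really is needed at once, the same inner loop can run over the array returned by `File.ReadAllBytes(path)` instead, stepping the offset by `RecordSize`.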


Why 4KB or 8KB?

This is closer to the underlying Windows operating system buffers. Files in NTFS are normally stored in 4KB or 8KB chunks on the disk, although you can choose 32KB chunks.
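In .NET, the buffer size can also be passed when the file is opened, through the `FileStream` constructor that takes a `bufferSize` argument. The following is a sketch under assumptions: `BufferedOpenExample` and `CountBytes` are made-up names, and `FileOptions.SequentialScan` is included only as a hint that the file is read front to back; whether it helps depends on the system.

```csharp
using System;
using System.IO;

public static class BufferedOpenExample
{
    // Opens the file with an explicit internal buffer size (4096 by default here,
    // following the 4KB suggestion) and counts its bytes by reading sequentially.
    public static long CountBytes(string path, int bufferSize = 4096)
    {
        using (var stream = new FileStream(path, FileMode.Open, FileAccess.Read,
                                           FileShare.Read, bufferSize, FileOptions.SequentialScan))
        {
            var chunk = new byte[bufferSize];
            long total = 0;
            int read;
            while ((read = stream.Read(chunk, 0, chunk.Length)) > 0)
            {
                total += read; // replace with real per-chunk processing
            }
            return total;
        }
    }
}
```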
