Fastest way to read very large text file in C#
Question Description
I have a very basic question. I have several text files with data which are several GB in size each. I have a C# WPF application which I'm using to process similar data files, but nowhere close to that size (probably around 200-300 MB right now). How can I efficiently read this data and then write it somewhere else after processing, without everything freezing and crashing? Essentially, what's the best way to read from a very large file? For my low-scale application right now, I use System.IO.File.ReadAllLines to read and a StreamWriter to write. I'm sure those 2 methods are not the best idea for such large files. I don't have much experience with C#; any help will be appreciated!
Recommended Answer
If you can do this line by line, then the answer is simple:
- Read a line.
- Process the line.
- Write the line.
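Under the assumption that each line can be processed independently, a minimal sketch of the loop above might use File.ReadLines, which streams lines lazily, instead of File.ReadAllLines, which loads the whole multi-GB file into memory first (the paths and the Trim transformation are placeholders for illustration):

```csharp
using System;
using System.IO;

class LineByLine
{
    static void Main()
    {
        // Hypothetical paths for illustration.
        const string inputPath = "input.txt";
        const string outputPath = "output.txt";

        using (var writer = new StreamWriter(outputPath))
        {
            // File.ReadLines yields one line at a time, so memory use
            // stays flat regardless of file size.
            foreach (string line in File.ReadLines(inputPath))
            {
                // Placeholder processing step.
                string processed = line.Trim();
                writer.WriteLine(processed);
            }
        }
    }
}
```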
If you want it to go a bit faster, put those three steps in a pipeline connected by BlockingCollections with a specified upper bound of something like 10, so a slower step is never waiting on a faster step. If you can, output to a different physical disc (if output is to disc).
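A sketch of that pipeline, assuming line-oriented data: two bounded BlockingCollection queues connect the three stages, so reading, processing, and writing overlap, and the bounded capacity (10 here, as suggested above) stops a fast stage from buffering gigabytes ahead of a slow one. File names and the Trim step are placeholders:

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

class Pipeline
{
    static void Main()
    {
        // Bounded queues: Add blocks when the queue is full, so no
        // stage can run far ahead of the next one.
        var readLines = new BlockingCollection<string>(boundedCapacity: 10);
        var processedLines = new BlockingCollection<string>(boundedCapacity: 10);

        // Stage 1: read.
        var reader = Task.Run(() =>
        {
            foreach (string line in File.ReadLines("input.txt"))
                readLines.Add(line);
            readLines.CompleteAdding();
        });

        // Stage 2: process.
        var processor = Task.Run(() =>
        {
            foreach (string line in readLines.GetConsumingEnumerable())
                processedLines.Add(line.Trim()); // placeholder processing
            processedLines.CompleteAdding();
        });

        // Stage 3: write (ideally to a different physical disc).
        var writer = Task.Run(() =>
        {
            using (var output = new StreamWriter("output.txt"))
                foreach (string line in processedLines.GetConsumingEnumerable())
                    output.WriteLine(line);
        });

        Task.WaitAll(reader, processor, writer);
    }
}
```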
OP changed the rules even after being asked (twice) if the process was line by line. In that case:
- Read lines to build a unit of work (open tag to close tag).
- Process the unit of work.
- Write the unit of work.
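The reading stage then accumulates lines into a buffer until a complete unit has been seen. A sketch under the assumption that a unit is delimited by hypothetical `<record>` / `</record>` tags, each on its own line (the tag names, paths, and pass-through "processing" are placeholders):

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class UnitOfWork
{
    static void Main()
    {
        using (var writer = new StreamWriter("output.txt"))
        {
            var unit = new List<string>();
            bool inUnit = false;

            foreach (string line in File.ReadLines("input.txt"))
            {
                // Open tag starts a new unit of work.
                if (line.Contains("<record>")) { inUnit = true; unit.Clear(); }
                if (inUnit) unit.Add(line);
                // Close tag completes the unit: process it, then write it.
                if (line.Contains("</record>"))
                {
                    inUnit = false;
                    foreach (string unitLine in unit)
                        writer.WriteLine(unitLine); // placeholder processing
                }
            }
        }
    }
}
```

The same stage structure as before applies: this buffering replaces the per-line read step, and the completed units can be handed to the processing and writing stages through the same bounded queues.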