Fastest way to read very large text file in C#


Question


I have a very basic question. I have several text files with data which are several GBs in size each. I have a C# WPF application which I'm using to process similar data files, but nowhere close to that size (probably around 200-300 MB right now). How can I efficiently read this data and then write it somewhere else after processing, without everything freezing and crashing? Essentially, what's the best way to read from a very large file? For my low-scale application right now, I use System.IO.File.ReadAllLines to read and a StreamWriter to write. I'm sure those two methods are not the best idea for such large files. I don't have much experience with C#; any help will be appreciated!

Answer


If you can do this line by line then the answer is simple:

  1. Read a line.
  2. Process the line.
  3. Write the line.
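The three steps above can be sketched as a simple streaming loop. This is a minimal illustration, not code from the answer: the `Process` method is a hypothetical placeholder for whatever per-line work the application does, and `input.txt`/`output.txt` are made-up paths.

```csharp
using System;
using System.IO;

class StreamingCopy
{
    // Hypothetical placeholder for the application's per-line processing.
    public static string Process(string line) => line.ToUpperInvariant();

    // Streams the input one line at a time, so memory use stays roughly constant
    // regardless of file size (unlike File.ReadAllLines, which loads everything).
    public static void Copy(TextReader reader, TextWriter writer)
    {
        string? line;
        while ((line = reader.ReadLine()) != null)  // 1. read a line
        {
            writer.WriteLine(Process(line));        // 2. process it, 3. write it
        }
    }

    static void Main()
    {
        // Hypothetical paths; guarded so the sketch is safe to run anywhere.
        if (!File.Exists("input.txt")) return;
        using var reader = new StreamReader("input.txt");
        using var writer = new StreamWriter("output.txt");
        Copy(reader, writer);
    }
}
```

Because only one line is held in memory at a time, this approach works the same for a 300 MB file and a multi-GB file.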


If you want it to go a bit faster, run those three steps concurrently, connected by BlockingCollections with a specified upper bound of something like 10, so a slower step is never waiting on a faster step. Output to a different physical disc if you can (assuming the output goes to disc at all).
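A sketch of that pipeline, assuming hypothetical paths and a placeholder `Process` method: each stage runs on its own task, and the bounded `BlockingCollection` queues between them cap how far any stage can run ahead.

```csharp
using System;
using System.Collections.Concurrent;
using System.IO;
using System.Threading.Tasks;

class Pipeline
{
    // Hypothetical placeholder for the application's per-line processing.
    public static string Process(string line) => line.Trim();

    public static void Run(string inputPath, string outputPath)
    {
        // Bounded queues: a stage that gets ahead blocks on Add until the
        // next stage catches up, which caps memory at roughly the capacity.
        var readLines = new BlockingCollection<string>(boundedCapacity: 10);
        var processedLines = new BlockingCollection<string>(boundedCapacity: 10);

        var reader = Task.Run(() =>
        {
            foreach (var line in File.ReadLines(inputPath)) // streams lazily
                readLines.Add(line);
            readLines.CompleteAdding();
        });

        var processor = Task.Run(() =>
        {
            foreach (var line in readLines.GetConsumingEnumerable())
                processedLines.Add(Process(line));
            processedLines.CompleteAdding();
        });

        var writer = Task.Run(() =>
        {
            using var output = new StreamWriter(outputPath);
            foreach (var line in processedLines.GetConsumingEnumerable())
                output.WriteLine(line);
        });

        Task.WaitAll(reader, processor, writer);
    }

    static void Main()
    {
        // Hypothetical paths; guarded so the sketch is safe to run anywhere.
        if (File.Exists("input.txt"))
            Run("input.txt", "output.txt");
    }
}
```

This way reading, processing, and writing overlap in time, which helps most when the output goes to a different physical disc than the input.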


OP changed the rules even after being asked if the process was line by line (twice).

  1. Read lines to build a unit of work (from an open tag to its close tag).
  2. Process the unit of work.
  3. Write the unit of work.
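The first step above, grouping lines into units of work, might look like the following. This is a sketch under assumptions: the `<record>`/`</record>` markers are hypothetical, since the answer does not say what the real open and close tags are.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

class UnitOfWorkReader
{
    // Hypothetical markers; the real open/close tags depend on the file format.
    const string OpenTag = "<record>";
    const string CloseTag = "</record>";

    // Streams the file and yields one complete unit (open tag through close tag)
    // at a time, so only the current unit is ever held in memory.
    public static IEnumerable<List<string>> ReadUnits(TextReader reader)
    {
        List<string>? unit = null;
        string? line;
        while ((line = reader.ReadLine()) != null)
        {
            if (line.Contains(OpenTag))
                unit = new List<string>();
            unit?.Add(line);
            if (unit != null && line.Contains(CloseTag))
            {
                yield return unit;
                unit = null;
            }
        }
    }

    static void Main()
    {
        var sample = "<record>\nline a\n</record>\n<record>\nline b\n</record>";
        foreach (var unit in ReadUnits(new StringReader(sample)))
            Console.WriteLine($"unit with {unit.Count} lines");
    }
}
```

The resulting units can then be fed through the same bounded-queue pipeline as before, with each stage handling a whole unit instead of a single line.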
