Reading a BIG text file (200Mb)


Problem description


Hi!

I'm developing an application to analyze some information that is
inside a text file. The size of the text file ranges from 50 MB to
220 MB.

The text file is like a table, with "rows" and "columns", and several
"cells" together represent an object in my application.

What I need to do is read the text file line by line, parse the
information, create my objects, and save them somewhere (an ArrayList) so
that I can process them afterwards.

The question is: how do I do this in an efficient way?

Thank you in advance for your help,

ACC

Solution

In .NET 2.0, the System.IO.File class has a ReadAllLines("file.txt")
method that accepts a file path. That method returns a string array of all
the lines in that file. Each line is separated by a carriage return and
line feed. You can then break each line apart further by doing a split
using some delimiter. The .NET Framework will be doing all of the file
reading for you, so maybe it is doing it most efficiently.

I hope this helps.
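
For reference, the ReadAllLines approach described above might look roughly like the sketch below. The class name ReadAllLinesSketch, the method LoadTable, and the delimiter parameter are hypothetical; the thread never states the actual file format. Note that this holds every line of the file in memory at once.

```csharp
using System;
using System.IO;

static class ReadAllLinesSketch
{
    // Loads every line of the file into memory, then splits each line
    // into cells. The delimiter is an assumption about the file format.
    public static string[][] LoadTable(string path, char delimiter)
    {
        // The whole file is read and held in memory here.
        string[] lines = File.ReadAllLines(path);

        string[][] rows = new string[lines.Length][];
        for (int i = 0; i < lines.Length; i++)
        {
            rows[i] = lines[i].Split(delimiter);
        }
        return rows;
    }
}
```

A call such as ReadAllLinesSketch.LoadTable("file.txt", '\t') would return one string array per line, assuming a tab-delimited file.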


ACC,

Is there any reason why you need all of the items in your array list at
one time? Can you process them in batches?

What is the format of your file? If it is a delimited text file, you
might want to consider loading it into a table in SQL Server through Bulk
Copy Services, and then performing your operations on the table. Of course,
this is only applicable if you are performing some sort of set operation.
If you are not (say, you are doing some super complex calculation), then
this isn't appropriate.

Ultimately, processing each object as it comes in, and keeping around
only what you need (instead of all of them), is the best way.

Hope this helps.
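
A minimal sketch of the "process each object as it comes in" idea, using StreamReader.ReadLine so that only the current line and the records worth keeping are in memory. The Record type, the two-column layout (a key followed by a number), the delimiter, and the minValue filter are all assumptions made for illustration; none of them come from the thread. List<Record> is used here in place of the ArrayList mentioned in the question.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.IO;

// Hypothetical record type; the thread does not describe the real objects.
sealed class Record
{
    public readonly string Key;
    public readonly double Value;

    public Record(string key, double value)
    {
        Key = key;
        Value = value;
    }
}

static class StreamingSketch
{
    // Reads the file one line at a time and keeps only the rows whose
    // second column parses as a number that is at least minValue.
    public static List<Record> LoadWanted(string path, char delimiter, double minValue)
    {
        List<Record> kept = new List<Record>();
        using (StreamReader reader = new StreamReader(path))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                string[] cells = line.Split(delimiter);
                if (cells.Length < 2)
                {
                    continue; // skip rows with too few columns
                }

                double value;
                if (!double.TryParse(cells[1], NumberStyles.Float,
                                     CultureInfo.InvariantCulture, out value))
                {
                    continue; // skip rows whose second column is not numeric
                }

                if (value >= minValue)
                {
                    kept.Add(new Record(cells[0], value));
                }
            }
        }
        return kept;
    }
}
```

If the per-row work can be finished inside the loop (for example, updating a running total), the list can be dropped entirely and memory use stays roughly constant regardless of file size.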

--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com



This would be a VERY bad idea. If the file is 50-200MB, reading all of
those lines in the file into a string (or array of strings) is going to be a
performance nightmare.
--
- Nicholas Paldino [.NET/C# MVP]
- mv*@spam.guard.caspershouse.com

"tdavisjr" <td******@gmail.com> wrote in message
news:11*********************@g47g2000cwa.googlegroups.com...

In .NET 2.0, the System.IO.File class has a ReadAllLines("file.txt")
method that accepts a file path. That method returns a string array of all
the lines in that file. Each line is separated by a carriage return and
line feed. You can then break each line apart further by doing a split
using some delimiter. The .NET Framework will be doing all of the file
reading for you, so maybe it is doing it most efficiently.

I hope this helps.


