Flat file parsing using .NET C#?

Problem description

Hi, I have a file containing card numbers. It has 1 million card numbers. I need to check for duplicate card numbers in this file, report them in another file, and generate a third file containing all the unique card numbers. I can't use any database for this operation. Please advise on the best way to do this in minimum time.

Recommended answer

Read the file into a string array, sort it, and run through it looking for identical values, which will be next to each other after sorting.
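
The sort-and-scan idea above can be sketched roughly as follows. This is only an illustration under some assumptions: the names SortScan and Split are made up for the example, lines are compared exactly (ordinal comparison, no trimming), and "unique" is taken to mean each distinct card number listed once.

```csharp
using System;
using System.Collections.Generic;

public static class SortScan
{
    // Hypothetical helper: sorts a copy of the input, then walks it once.
    // After sorting, equal values sit next to each other, so a value is a
    // duplicate exactly when it equals its neighbour.
    public static void Split(string[] lines, List<string> unique, List<string> duplicates)
    {
        string[] sorted = (string[])lines.Clone();
        Array.Sort(sorted, StringComparer.Ordinal);

        for (int i = 0; i < sorted.Length; i++)
        {
            bool firstOfRun = i == 0 || sorted[i] != sorted[i - 1];
            if (!firstOfRun)
            {
                continue; // already handled when the run started
            }

            unique.Add(sorted[i]);

            // Report a repeated value once, however many times it occurs.
            if (i + 1 < sorted.Length && sorted[i + 1] == sorted[i])
            {
                duplicates.Add(sorted[i]);
            }
        }
    }
}
```

The two lists could then be written out with File.WriteAllLines. Sorting costs O(n log n), which should be manageable for a million short strings, but the whole file still sits in memory.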

Or use LINQ methods:
string[] lines = File.ReadAllLines(@"D:\Temp\101.txt");
string[] dups = lines.GroupBy(i => i).Where(g => g.Count() > 1).Select(g => g.Key).ToArray();

Or perhaps:
List<string> uniqueElements = System.IO.File.ReadAllLines("File.txt").Distinct().ToList();

Regards
Espen Harlinn

I would suggest parsing the file line by line in a separate thread, so that your application stays responsive and you can cancel the operation at any time. It is also not a good idea to load the whole file into memory, since 1 million lines could use a fair amount of memory depending on how much data each line contains.

You could try the ReadLine method of TextReader or StreamReader.
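
As a rough illustration of reading line by line off the calling thread with cancellation, here is a minimal sketch. It uses Task.Run and a CancellationToken, which is one possible choice and not something the answer prescribes; the names BackgroundLineReader and CountLinesAsync are made up for the example, and counting lines stands in for the real per-line work.

```csharp
using System;
using System.IO;
using System.Threading;
using System.Threading.Tasks;

public static class BackgroundLineReader
{
    // Hypothetical helper: reads the reader to the end on a thread-pool
    // thread and returns the number of lines seen. The token is checked
    // once per line, so a cancel request takes effect promptly.
    public static Task<int> CountLinesAsync(TextReader reader, CancellationToken token)
    {
        return Task.Run(() =>
        {
            int count = 0;
            while (reader.ReadLine() != null)
            {
                token.ThrowIfCancellationRequested();
                count++; // placeholder for the real per-line processing
            }
            return count;
        }, token);
    }
}
```

A caller would typically keep a CancellationTokenSource, pass its Token here, and call Cancel() from a UI handler to stop the read.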

Other read methods are also available if you prefer them.
Read more under TextReader.ReadLine on MSDN.

FileStream fileStream = new FileStream(unit.FileName, FileMode.Open, FileAccess.Read, FileShare.Read, 256, FileOptions.SequentialScan);
StreamReader streamReader = new StreamReader(fileStream);

while (true)
{
    string line = streamReader.ReadLine(); // reads a line and moves the file pointer to the next line
    if (line == null)
    {
        break; // reached the end of the file
    }
    // parse the line
}
streamReader.Close();
fileStream.Close();
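
For the "parse the line" step, one possibility is to track the values seen so far in a HashSet and write each line straight to one of the two output files. The answer does not name this technique, so treat the following as a sketch under assumptions: CardFileSplitter and Split are hypothetical names, lines are trimmed and blank lines skipped, and each repeated number is reported once.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class CardFileSplitter
{
    // Hypothetical helper: streams the input one line at a time.
    // The first occurrence of a value goes to uniqueOut; a value seen
    // again is written once to duplicateOut.
    // Returns the number of distinct values that were repeated.
    public static int Split(TextReader input, TextWriter uniqueOut, TextWriter duplicateOut)
    {
        var seen = new HashSet<string>(StringComparer.Ordinal);
        var reported = new HashSet<string>(StringComparer.Ordinal);
        int repeatedValues = 0;

        string line;
        while ((line = input.ReadLine()) != null)
        {
            string card = line.Trim();
            if (card.Length == 0)
            {
                continue; // assumption: blank lines carry no card number
            }

            if (seen.Add(card))
            {
                uniqueOut.WriteLine(card);      // first time this value appears
            }
            else if (reported.Add(card))
            {
                duplicateOut.WriteLine(card);   // first repeat of this value
                repeatedValues++;
            }
        }
        return repeatedValues;
    }
}
```

With real files, the three arguments would be a StreamReader over the input and two StreamWriter instances, each wrapped in a using statement. The set still holds every distinct number in memory, so this avoids the full-file array but not the per-value cost; in exchange it makes a single pass and needs no sort.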

FileOptions.SequentialScan, as passed to the FileStream constructor above, can speed up file reading when you read the file from top to bottom.