Text file parsing - how can I search for a specific string and return the whole line?

Question

For example, a text file has the following entries:

england is cold country
India is poor country
england is cold country
england is cold country
India is poor country
english county cricket season.

Now I want to search this text file for the string "england" and return the entire line containing this string. How can I do it in C#?

Recommended answer

I would consider two approaches: one for large files (megabytes) and one for relatively small files.

If the file is large and contains megabytes of data: use a StreamReader, read the file line by line until the end of the file, and check each line as soon as it has been read.

using System.Collections.Generic;
using System.IO;

string pattern = "england";
IList<string> result = new List<string>();
using (var reader = new StreamReader("TestFile.txt"))
{
    string currentLine;
    // ReadLine() returns null once the end of the file is reached
    while ((currentLine = reader.ReadLine()) != null)
    {
        if (currentLine.Contains(pattern))
        {
            // if you do not need multiple lines and just the first one,
            // add it and then break from the loop (break;)
            result.Add(currentLine);
        }
    }
}
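To try the streaming approach end to end, here is a minimal sketch that writes the sample entries from the question to a temporary file and scans it line by line. The helper name FindLines and the temporary file are assumptions made for illustration; they are not part of the original answer.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

// Hypothetical helper: returns every line of the file that contains 'pattern'.
static List<string> FindLines(string filePath, string pattern)
{
    var matches = new List<string>();
    using (var reader = new StreamReader(filePath))
    {
        // ReadLine() returns null at the end of the file, which ends the loop.
        while (reader.ReadLine() is string line)
        {
            // Ordinal comparison: case-sensitive and culture-independent.
            if (line.IndexOf(pattern, StringComparison.Ordinal) >= 0)
            {
                matches.Add(line);
            }
        }
    }
    return matches;
}

// Sample data from the question, written to a temporary file for the demo.
string samplePath = Path.GetTempFileName();
File.WriteAllLines(samplePath, new[]
{
    "england is cold country",
    "India is poor country",
    "england is cold country",
    "england is cold country",
    "India is poor country",
    "english county cricket season."
});

List<string> found = FindLines(samplePath, "england");
foreach (string match in found)
{
    Console.WriteLine(match);
}

File.Delete(samplePath);
```

Only the three "england is cold country" lines match; "english county cricket season." does not, because it does not contain the full substring "england".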

Small file

If the file is small, you can use a helper that returns the whole file content as an array of strings, one string per line (File.ReadAllLines()), and then use LINQ to search for the substring. If you are using .NET 4 or newer, you can use the newer helper File.ReadLines(), which does not read the entire file up front and instead reads it lazily, as a deferred operation.

.NET 2.0 - 3.5:

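string pattern = "england";
// ReadAllLines loads the whole file into a string[]; Where requires System.Linq (.NET 3.5 or newer)
IEnumerable<string> result = File.ReadAllLines("TestFile.txt")
                                 .Where(l => l.Contains(pattern));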

.NET 4 - 4.5:

string pattern = "england";
// ReadLines yields lines one at a time as the sequence is enumerated
IEnumerable<string> result = File.ReadLines("TestFile.txt")
                                 .Where(l => l.Contains(pattern));

If you need just the first matching line, use .FirstOrDefault(l => l.Contains(pattern)) instead of .Where(l => l.Contains(pattern)).
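One detail worth seeing in practice is what FirstOrDefault returns when nothing matches. The following is a minimal sketch, assuming a temporary sample file and a second search term ("france") chosen purely for illustration:

```csharp
using System;
using System.IO;
using System.Linq;

// Temporary sample file (an assumption for the demo).
string samplePath = Path.GetTempFileName();
File.WriteAllLines(samplePath, new[]
{
    "India is poor country",
    "england is cold country",
    "england is cold country"
});

// First line containing the pattern, or null when no line matches.
var first = File.ReadLines(samplePath).FirstOrDefault(l => l.Contains("england"));
var missing = File.ReadLines(samplePath).FirstOrDefault(l => l.Contains("france"));

Console.WriteLine(first ?? "(no match)");    // prints "england is cold country"
Console.WriteLine(missing ?? "(no match)");  // prints "(no match)"

File.Delete(samplePath);
```

Because string is a reference type, the "default" in FirstOrDefault is null, so the result should be checked before it is used.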

MSDN:

The ReadLines and ReadAllLines methods differ as follows: When you use ReadLines, you can start enumerating the collection of strings before the whole collection is returned; when you use ReadAllLines, you must wait for the whole array of strings to be returned before you can access the array. Therefore, when you are working with very large files, ReadLines can be more efficient.
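The difference can be observed with a small experiment. This is a sketch under stated assumptions: the temporary file, the sample ordering, and the examined counter are introduced here for illustration only.

```csharp
using System;
using System.IO;
using System.Linq;

// Temporary sample file (an assumption for the demo).
string samplePath = Path.GetTempFileName();
File.WriteAllLines(samplePath, new[]
{
    "India is poor country",
    "england is cold country",
    "India is poor country",
    "england is cold country"
});

// ReadAllLines: every line is already in memory when the call returns.
string[] all = File.ReadAllLines(samplePath);
Console.WriteLine(all.Length);   // prints 4

// ReadLines: lines are pulled one at a time; count how many get examined
// before FirstOrDefault finds a match and stops enumerating.
int examined = 0;
var firstMatch = File.ReadLines(samplePath)
                     .Select(l => { examined++; return l; })
                     .FirstOrDefault(l => l.Contains("england"));

Console.WriteLine(firstMatch);   // prints "england is cold country"
Console.WriteLine(examined);     // prints 2

File.Delete(samplePath);
```

With ReadLines the search stops at the second line, whereas ReadAllLines has to materialize all four lines before any filtering can begin.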
