Reading a file line by line in C#


Question

I am trying to read some text files in which each line needs to be processed. At the moment I am just using a StreamReader and reading each line individually.

I am wondering whether there is a more efficient way (in terms of lines of code and readability) to do this using LINQ without compromising runtime efficiency. The examples I have seen load the whole file into memory and then process it, which I don't believe would be very efficient here. In the first example the files can get up to about 50k, and in the second example not all lines of the file need to be read (sizes are typically < 10k).

You could argue that nowadays it doesn't really matter for files this small, but I believe that sort of approach leads to inefficient code.

First example:

// Open file
using(var file = System.IO.File.OpenText(_LstFilename))
{
    // Read file
    while (!file.EndOfStream)
    {
        String line = file.ReadLine();

        // Ignore empty lines
        if (line.Length > 0)
        {
            // Create addon
            T addon = new T();
            addon.Load(line, _BaseDir);

            // Add to collection
            collection.Add(addon);
        }
    }
}

Second example:

// Open file
using (var file = System.IO.File.OpenText(datFile))
{
    // Compile regexs
    Regex nameRegex = new Regex("IDENTIFY (.*)");

    while (!file.EndOfStream)
    {
        String line = file.ReadLine();

        // Check name
        Match m = nameRegex.Match(line);
        if (m.Success)
        {
            _Name = m.Groups[1].Value;

            // Remove me when other values are read
            break;
        }
    }
}

Recommended answer

You can write a LINQ-based line reader pretty easily using an iterator block:

static IEnumerable<SomeType> ReadFrom(string file) {
    string line;
    using(var reader = File.OpenText(file)) {
        while((line = reader.ReadLine()) != null) {
            SomeType newRecord = ParseLine(line); // ParseLine is a placeholder for your own parsing code
            yield return newRecord;
        }
    }
}

Or, to make Jon happy:

static IEnumerable<string> ReadFrom(string file) {
    string line;
    using(var reader = File.OpenText(file)) {
        while((line = reader.ReadLine()) != null) {
            yield return line;
        }
    }
}
...
var typedSequence = from line in ReadFrom(path)
                    let record = ParseLine(line)
                    where record.Active // for example
                    select record.Key;

You then have ReadFrom(...) as a lazily evaluated sequence without buffering, perfect for Where etc.
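As a rough sketch of how the two loops from the question might look on top of ReadFrom, the following may help. The class name, the helper names NonEmptyLines and FindName, and the sample file contents are all made up for illustration; the asker's addon.Load step is left out and would go wherever the non-empty lines are consumed.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

public static class LineReaderDemo
{
    // Same iterator as in the answer: yields one line at a time.
    public static IEnumerable<string> ReadFrom(string file)
    {
        string line;
        using (var reader = File.OpenText(file))
        {
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }

    // First example from the question: ignore empty lines (hypothetical helper).
    public static IEnumerable<string> NonEmptyLines(string file)
    {
        return ReadFrom(file).Where(line => line.Length > 0);
    }

    // Second example: stop reading at the first "IDENTIFY ..." line (hypothetical helper).
    // Returns null when no line matches.
    public static string FindName(string file)
    {
        var nameRegex = new Regex("IDENTIFY (.*)");
        return ReadFrom(file)
            .Select(line => nameRegex.Match(line))
            .Where(m => m.Success)
            .Select(m => m.Groups[1].Value)
            .FirstOrDefault();
    }

    public static void Main()
    {
        // Sample file created only so the sketch runs on its own.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "first", "", "IDENTIFY demo", "never read" });

        Console.WriteLine(NonEmptyLines(path).Count()); // 3 non-empty lines
        Console.WriteLine(FindName(path));              // demo

        File.Delete(path);
    }
}
```

FirstOrDefault disposes the enumerator as soon as it has a match, so the using block inside ReadFrom closes the file without reading the remaining lines. On .NET Framework 4 and later, the built-in File.ReadLines also returns a lazily evaluated IEnumerable<string> and could stand in for ReadFrom here.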

Note that if you use OrderBy or the standard GroupBy, it will have to buffer the data in memory; if you need grouping and aggregation, "PushLINQ" has some fancy code that lets you perform aggregations on the data and then discard it (no buffering). Jon's explanation is here.
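To illustrate the buffering point without PushLINQ itself, here is a minimal sketch of a hand-rolled streaming aggregation. It is not PushLINQ's API; it is a plain foreach plus a dictionary. The "key,value" line format, the class name, and the CountByKey helper are assumptions made up for this example.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class StreamingAggregateDemo
{
    // Same iterator as in the answer: yields one line at a time.
    public static IEnumerable<string> ReadFrom(string file)
    {
        string line;
        using (var reader = File.OpenText(file))
        {
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }

    // Counts lines per key while keeping only one counter per distinct key
    // in memory, rather than every line as GroupBy would.
    public static Dictionary<string, int> CountByKey(IEnumerable<string> lines)
    {
        var counts = new Dictionary<string, int>();
        foreach (string line in lines)
        {
            string key = line.Split(',')[0]; // assumed "key,value" format
            int current;
            counts.TryGetValue(key, out current); // current stays 0 for a new key
            counts[key] = current + 1;
        }
        return counts;
    }

    public static void Main()
    {
        // Sample file created only so the sketch runs on its own.
        string path = Path.GetTempFileName();
        File.WriteAllLines(path, new[] { "a,1", "b,2", "a,3" });

        Dictionary<string, int> counts = CountByKey(ReadFrom(path));
        Console.WriteLine(counts["a"]); // 2
        Console.WriteLine(counts["b"]); // 1

        // Sum also streams: it keeps a running total and no lines.
        int total = ReadFrom(path).Sum(l => int.Parse(l.Split(',')[1]));
        Console.WriteLine(total); // 6

        File.Delete(path);
    }
}
```

Operators such as Sum, Count and Aggregate consume the sequence one element at a time, so they pair well with the lazy reader; OrderBy and GroupBy must see every element before they can yield their first result.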
