Reading a file line by line in C#

Question

I am trying to read some text files, where each line needs to be processed. At the moment I am just using a StreamReader and reading each line individually.

I am wondering whether there is a more efficient way (in terms of LoC and readability) to do this using LINQ without compromising operational efficiency. The examples I have seen involve loading the whole file into memory and then processing it. In this case, however, I don't believe that would be very efficient. In the first example the files can get up to about 50k, and in the second example not all lines of the file need to be read (sizes are typically < 10k).

You could argue that nowadays it doesn't really matter for these small files; however, I believe that sort of approach leads to inefficient code.

First example:

// Open file
using(var file = System.IO.File.OpenText(_LstFilename))
{
    // Read file
    while (!file.EndOfStream)
    {
        String line = file.ReadLine();

        // Ignore empty lines
        if (line.Length > 0)
        {
            // Create addon
            T addon = new T();
            addon.Load(line, _BaseDir);

            // Add to collection
            collection.Add(addon);
        }
    }
}

Second example:

// Open file
using (var file = System.IO.File.OpenText(datFile))
{
    // Compile regexs
    Regex nameRegex = new Regex("IDENTIFY (.*)");

    while (!file.EndOfStream)
    {
        String line = file.ReadLine();

        // Check name
        Match m = nameRegex.Match(line);
        if (m.Success)
        {
            _Name = m.Groups[1].Value;

            // Remove me when other values are read
            break;
        }
    }
}

Recommended answer

You can write a LINQ-based line reader pretty easily using an iterator block:

static IEnumerable<SomeType> ReadFrom(string file) {
    string line;
    using(var reader = File.OpenText(file)) {
        while((line = reader.ReadLine()) != null) {
            // ParseLine stands in for your own parsing routine (placeholder)
            SomeType newRecord = ParseLine(line);
            yield return newRecord;
        }
    }
}

Or, to make Jon happy:

static IEnumerable<string> ReadFrom(string file) {
    string line;
    using(var reader = File.OpenText(file)) {
        while((line = reader.ReadLine()) != null) {
            yield return line;
        }
    }
}
...
var typedSequence = from line in ReadFrom(path)
                    let record = ParseLine(line)
                    where record.Active // for example
                    select record.Key;

Then you have ReadFrom(...) as a lazily evaluated sequence without buffering, perfect for Where etc.
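
To illustrate the laziness, here is a minimal sketch of the question's second example rebuilt on top of such a line sequence. The names `LineReaderDemo` and `FindName` are invented for illustration; the sketch assumes the same `IDENTIFY (.*)` pattern used in the question. On .NET 4.0 and later, the built-in `File.ReadLines` returns a similarly lazy sequence and could likely replace `ReadFrom` here.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

static class LineReaderDemo
{
    // Same iterator block as in the answer: yields one line at a time.
    public static IEnumerable<string> ReadFrom(string file)
    {
        using (var reader = File.OpenText(file))
        {
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                yield return line;
            }
        }
    }

    // Hypothetical helper: returns the captured name from the first
    // "IDENTIFY ..." line, or null if no line matches.
    public static string FindName(IEnumerable<string> lines)
    {
        var nameRegex = new Regex("IDENTIFY (.*)");
        return lines
            .Select(line => nameRegex.Match(line))
            .Where(m => m.Success)
            .Select(m => m.Groups[1].Value)
            .FirstOrDefault(); // stops pulling lines at the first match
    }
}
```

Called as `LineReaderDemo.FindName(LineReaderDemo.ReadFrom(datFile))`, this reads only as far as the first matching line. `FirstOrDefault` disposes the enumerator when it returns, which in turn runs the `using` block's cleanup and closes the file.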

Note that if you use OrderBy or the standard GroupBy, it will have to buffer the data in memory; if you need grouping and aggregation, "PushLINQ" has some fancy code that lets you perform aggregations on the data and then discard it (no buffering). Jon's explanation is here.
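
As a rough illustration of the buffering point, the sketch below shows two aggregations that consume a line sequence without holding the lines in memory. This is not PushLINQ, just plain LINQ to Objects and a dictionary; `StreamingAggregates`, `CountNonEmpty`, `CountByKey`, and the `keySelector` parameter are hypothetical names chosen for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static class StreamingAggregates
{
    // Count consumes the sequence one item at a time and keeps only the
    // running total, so no lines are buffered.
    public static int CountNonEmpty(IEnumerable<string> lines)
    {
        return lines.Count(line => line.Length > 0);
    }

    // Hand-rolled grouped count: memory grows with the number of distinct
    // keys, not with the number of lines. keySelector is a stand-in for
    // whatever parsing extracts the key from a line.
    public static Dictionary<string, int> CountByKey(
        IEnumerable<string> lines, Func<string, string> keySelector)
    {
        var counts = new Dictionary<string, int>();
        foreach (var line in lines)
        {
            var key = keySelector(line);
            int current;
            counts.TryGetValue(key, out current); // current stays 0 if absent
            counts[key] = current + 1;
        }
        return counts;
    }
}
```

By contrast, `lines.GroupBy(keySelector)` holds every line until the source is exhausted, and `OrderBy` must see the whole sequence before it can yield its first element.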
