C# Reading a File Line By Line


Problem Description

I am trying to read some text files, where each line needs to be processed. At the moment I am just using a StreamReader, and then reading each line individually.

I am wondering whether there is a more efficient way (in terms of LoC and readability) to do this using LINQ without compromising operational efficiency. The examples I have seen involve loading the whole file into memory, and then processing it. In this case however I don't believe that would be very efficient. In the first example the files can get up to about 50k, and in the second example, not all lines of the file need to be read (sizes are typically < 10k).

You could argue that nowadays it doesn't really matter for these small files; however, I believe that sort of approach leads to inefficient code.

Thanks for your time!

First example:

// open file
using(var file = System.IO.File.OpenText(_LstFilename))
{
    // read file
    while (!file.EndOfStream)
    {
        String line = file.ReadLine();

        // ignore empty lines
        if (line.Length > 0)
        {
            // create addon
            T addon = new T();
            addon.Load(line, _BaseDir);

            // add to collection
            collection.Add(addon);
        }
    }
}

Second example:

// open file
using (var file = System.IO.File.OpenText(datFile))
{
    // compile regexs
    Regex nameRegex = new Regex("IDENTIFY (.*)");

    while (!file.EndOfStream)
    {
        String line = file.ReadLine();

        // check name
        Match m = nameRegex.Match(line);
        if (m.Success)
        {
            _Name = m.Groups[1].Value;

            // remove me when other values are read
            break;
        }
    }
}

Solution

You can write a LINQ-based line reader pretty easily using an iterator block:

static IEnumerable<SomeType> ReadFrom(string file) {
    string line;
    using(var reader = File.OpenText(file)) {
        while((line = reader.ReadLine()) != null) {
            SomeType newRecord = ParseLine(line); /* parse line */
            yield return newRecord;
        }
    }
}

or to make Jon happy:

static IEnumerable<string> ReadFrom(string file) {
    string line;
    using(var reader = File.OpenText(file)) {
        while((line = reader.ReadLine()) != null) {
            yield return line;
        }
    }
}
...
var typedSequence = from line in ReadFrom(path)
                    let record = ParseLine(line)
                    where record.Active // for example
                    select record.Key;

then you have ReadFrom(...) as a lazily evaluated sequence without buffering, perfect for Where etc.
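
For example, the first loop from the question could be expressed against the string-returning ReadFrom in the same way. This is only a sketch reusing the question's own names (T, _LstFilename, _BaseDir, collection) and assuming the same Load semantics:

// empty-line filtering and addon construction as a deferred LINQ query;
// nothing is read from disk until the foreach starts iterating
var addons = ReadFrom(_LstFilename)
    .Where(line => line.Length > 0)   // ignore empty lines
    .Select(line =>
    {
        T addon = new T();            // T and _BaseDir as in the question
        addon.Load(line, _BaseDir);
        return addon;
    });

foreach (var addon in addons)
    collection.Add(addon);

Because Where and Select are deferred, this still processes one line at a time rather than loading the whole file.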

Note that if you use OrderBy or the standard GroupBy, it will have to buffer the data in memory; if you need grouping and aggregation, "PushLINQ" has some fancy code to allow you to perform aggregations on the data while discarding it (no buffering). Jon's explanation is here: http://codeblog.jonskeet.uk/2008/01/04/quot-push-quot-linq-revisited-next-attempt-at-an-explanation/
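
As an aside, a single streaming aggregate over the lazy sequence can already be computed with standard operators such as Count or Aggregate, which consume the iterator once without buffering it; a trivial sketch:

// counts non-empty lines in one pass; nothing is buffered because
// Count() simply walks the iterator produced by ReadFrom
int nonEmptyLines = ReadFrom(path).Count(line => line.Length > 0);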
