How to read from a Text File Faster/Smarter?


Problem description

I want to know if it is possible to read from a text file in a faster and smarter way.

This is a typical format of my data in a text file:

Call this "part":

ID:1;
FIELD1 :someText;
FIELD2 :someText;
FIELD3 :someText;
FIELD4 :someText;
FIELD5 :someText;
FIELD6 :someText;
FIELD7 :someText;
FIELD8 :someText;
END_ID :
01: someData;
02: someData;
...
...
48: someData;
ENDCARD:

I have thousands of them in a text file.

Is it possible to use LINQ to read it "part" by "part"? I don't want to loop through every single line.

Will it be possible for LINQ to start at ID:1; and end at ENDCARD:?

The reason for this is that I want to create an object for every "part"...

I had something like this in mind:

string[] lines = System.IO.File.ReadAllLines(SomeFilePath);

//Cleaning up the text file of unwanted text
var cleanedUpLines = from line in lines
                     where !line.StartsWith("FIELD1")
                     && !line.StartsWith("FIELD5")
                     && !line.StartsWith("FIELD8")
                     select line.Split(':');

//Here i want to LINQtoText "part" by "part"

//This i do not want to do!!!
foreach (string[] line in cleanedUpLines)
{
}

Solution

Here you go:

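using System;
using System.Collections.Generic;
using System.IO;

static class Program
{
    static void Main()
    {
        foreach (var part in ReadParts("Raw.txt"))
        {   // all the fields for the part are available; I'm just showing
            // one of them for illustration
            Console.WriteLine(part["ID"]);
        }
    }

    static IEnumerable<IDictionary<string, string>> ReadParts(string path)
    {
        using (var reader = File.OpenText(path))
        {
            var current = new Dictionary<string, string>();
            string line;
            while ((line = reader.ReadLine()) != null)
            {
                if (string.IsNullOrWhiteSpace(line)) continue;
                if (line.StartsWith("ENDCARD:"))
                {
                    // end of a card: hand it to the caller and start a new one
                    yield return current;
                    current = new Dictionary<string, string>();
                }
                else
                {
                    // split on the first ':' only, so a value may itself contain ':'
                    var parts = line.Split(new[] { ':' }, 2);
                    if (parts.Length < 2) continue; // no "key:value" pair on this line
                    current[parts[0].Trim()] = parts[1].Trim().TrimEnd(';');
                }
            }
            // a trailing card without ENDCARD: is still returned
            if (current.Count > 0) yield return current;
        }
    }
}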

What this does is create an iterator block: a state machine that reads and "yields" data as it is iterated, rather than reading the entire file in one go. It scans the lines; when a line marks the end of a card, the card is "yielded"; otherwise the line's data is added to a dictionary for storage.
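
To make the "does not read the entire file in one go" point concrete, here is a minimal sketch. The names LazyDemo, ReadLines and LinesReadForFirst are made up for illustration, and it reads from a StringReader rather than a file so that it stays self-contained. It counts how many lines are actually pulled from the reader when the caller asks only for the first item.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

public static class LazyDemo
{
    // Iterator block: yields one line at a time and calls onRead
    // each time a line is actually pulled from the reader.
    public static IEnumerable<string> ReadLines(TextReader reader, Action onRead)
    {
        string line;
        while ((line = reader.ReadLine()) != null)
        {
            onRead();
            yield return line;
        }
    }

    // Returns how many lines were read in order to produce the first item.
    // Assumes the text contains at least one line.
    public static int LinesReadForFirst(string text)
    {
        int count = 0;
        using (var reader = new StringReader(text))
        {
            ReadLines(reader, () => count++).First();
        }
        return count;
    }
}
```

Because First() stops enumerating as soon as it has one item, the rest of the input is never read; the same holds for an iterator that reads from a file.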

Note: if you have your own class that represents the data, then you could use something like reflection or FastMember to set the values by name.
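
As a rough sketch of the reflection route mentioned in the note: the Part class and the PartMapper helper below are hypothetical, and the sketch assumes the property names match the keys in the file exactly and that every mapped property is a string. Keys with no matching property (such as "01" or "END_ID") are skipped.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

// Hypothetical class for illustration; property names are assumed
// to match the keys that appear in the text file.
public class Part
{
    public string ID { get; set; }
    public string FIELD2 { get; set; }
    public string FIELD3 { get; set; }
}

public static class PartMapper
{
    // Copies dictionary entries onto same-named public string properties.
    // Entries without a matching writable string property are ignored.
    public static T Map<T>(IDictionary<string, string> values) where T : new()
    {
        var result = new T();
        foreach (var pair in values)
        {
            PropertyInfo property = typeof(T).GetProperty(
                pair.Key, BindingFlags.Public | BindingFlags.Instance);
            if (property != null && property.CanWrite
                && property.PropertyType == typeof(string))
            {
                property.SetValue(result, pair.Value, null);
            }
        }
        return result;
    }
}
```

With something like this in place, a sequence of dictionaries could be projected into objects with Select(PartMapper.Map<Part>). Reflection has a per-call cost, which is the reason a library such as FastMember may be preferable for large files.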

This does not use LINQ directly; however, it is implemented as an enumerable sequence, which is the building block of LINQ-to-Objects, so you could consume this with LINQ, for example:

var data = ReadParts("some.file").Skip(2).First(x => x["ID"] == "123");
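
Two caveats about the query above: First throws InvalidOperationException when nothing matches, and the dictionary indexer throws KeyNotFoundException for a part that has no "ID" key. The sketch below shows a more defensive way to consume the same kind of sequence; PartQueries and its method names are invented for illustration, and it accepts any sequence of dictionaries so that it does not depend on a file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class PartQueries
{
    // Returns the first part whose "ID" equals id, or null if there is none.
    public static IDictionary<string, string> FindById(
        IEnumerable<IDictionary<string, string>> parts, string id)
    {
        return parts.FirstOrDefault(p =>
        {
            string value;
            return p.TryGetValue("ID", out value) && value == id;
        });
    }

    // Lazily projects the IDs of all parts that contain the given key.
    public static IEnumerable<string> IdsHavingKey(
        IEnumerable<IDictionary<string, string>> parts, string key)
    {
        return parts
            .Where(p => p.ContainsKey(key) && p.ContainsKey("ID"))
            .Select(p => p["ID"]);
    }
}
```

Both methods keep the streaming behaviour: FindById stops at the first match, and IdsHavingKey reads parts only as the caller enumerates the result.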
