Parsing a text file using LINQ


Problem description

I am parsing a text file using LINQ, but I am stuck: the code below throws an out-of-range exception.

string[] lines = File.ReadAllLines(input);
var t1 = lines
    .Where(l => !l.StartsWith("#"))
    .Select(l => l.Split(' '))
    .Select(items => String.Format("{0}{1}{2}",
        items[1].PadRight(32),
        //items[1].PadRight(16)
        items[2].PadRight(32),
        items[3].PadRight(32)));
var t2 = t1
    .Select(l => l.ToUpper());
foreach (var t in t2)
    Console.WriteLine(t);



The file is about 200 to 500 lines long, and I want to extract specific information from it, so I need to split that information into different structures. How can I do this?

Recommended answer

I think it would be better to use a regex, if possible, rather than LINQ. After all, this is more about strings and characters than about data, tables, or lists, so why not use a text-oriented solution?


You need to parse a file, so the starting point is not LINQ (that is an implementation detail) but defining the grammar of the file at the level of detail needed to implement a good-enough parser.

The grammar in your case looks to me something like this
( {...} = 0-n repetitions, [...] = optional, a | b = a or b, '...' = literal text ):
file     : { line } .
line     : { ws } [ input | output | pin | data ] rest .
rest     : [ comment ] EOL .
comment  : { ws } '#' { NOT_EOL } .
ws       : SPACE_NOT_EOL .
input    : 'input' { ws } '=' { ws } number .
output   : 'output' { ws } '=' { ws } number .
pin      : 'pin' ws { ws } 'class' ws { ws } 'direction' ws { ws } 'no' .
data     : word ws { ws } number ws { ws } word ws { ws } number .
number   : DIGIT { DIGIT } .
word     : WORDCHAR { WORDCHAR } .



Implementing:

using System;
using System.IO;
using System.Linq;
using System.Text.RegularExpressions;

// Group 1:    input | output | pin | first word of a data line
// Groups 2-4: class, direction, no       (pin title line)
// Groups 2-4: number, direction, number  (data line)
// The trailing $ anchors each match at the end of its line (the EOL in the grammar).
Regex scan = new Regex(
    @"^\s*(?:(\w+)\s*(?:=\s*\w+|(\w+)\s+(\w+)\s+(\w+)))?\s*(?:[#].*)?$",
    RegexOptions.Multiline);

var lines = scan.Matches(File.ReadAllText(@"..\..\data.txt"))
    .Cast<Match>()
    .Where(m => m.Groups[1].Success);   // skip blank and comment-only lines

foreach (var m in lines)
{
    switch (m.Groups[1].Value)
    {
        case "input":
        case "output":
        case "pin":
            // ignored
            break;
        default:
            Console.WriteLine("{0,-16} {1,4} {2,-6} {3,4}",
                m.Groups[1].Value,
                m.Groups[2].Value,
                m.Groups[3].Value,
                m.Groups[4].Value);
            break;
    }
}



With the input file

# Header
input = 12
output = 4
# Data
pin class direction no     # Data Title
io    1      up        0
io    3      rught     1
cb    6      up        2
io    1      up        0
# End of data



this results in

io                  1 up        0
io                  3 rught     1
cb                  6 up        2
io                  1 up        0



Cheers
Andi
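
For readers who would rather stay with LINQ, here is a minimal sketch of a guarded variant that also puts each data row into its own structure. It is not part of either answer above. The likely cause of the exception in the question is that `Split(' ')` returns fewer than four items for some lines (for example `input = 12` yields only three), so `items[3]` is out of range. The `PinRow` and `PinParser` names are hypothetical, and the four fields are assumed to follow the `pin class direction no` title line of the sample file.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical structure for one data row; the property names are assumptions
// taken from the "pin class direction no" title line of the sample file.
public sealed class PinRow
{
    public string Pin { get; }
    public int Class { get; }
    public string Direction { get; }
    public int No { get; }

    public PinRow(string pin, int cls, string direction, int no)
    {
        Pin = pin;
        Class = cls;
        Direction = direction;
        No = no;
    }
}

public static class PinParser
{
    private static readonly char[] Separators = { ' ', '\t' };

    // Keeps only lines that have exactly four fields with numeric 2nd and 4th
    // fields, so no index can go out of range and the title line is skipped.
    public static List<PinRow> Parse(IEnumerable<string> lines)
    {
        return lines
            .Select(l => l.Split('#')[0])   // drop a full-line or trailing comment
            .Select(l => l.Split(Separators, StringSplitOptions.RemoveEmptyEntries))
            .Where(f => f.Length == 4)
            .Where(f => int.TryParse(f[1], out _) && int.TryParse(f[3], out _))
            .Select(f => new PinRow(f[0], int.Parse(f[1]), f[2], int.Parse(f[3])))
            .ToList();
    }
}
```

Calling `PinParser.Parse(File.ReadAllLines(input))` would then give a list that can be filtered or grouped with further LINQ operators. Unlike the regex version, this sketch silently drops any line that does not have exactly four fields, which may or may not be what you want for malformed input.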

