Parsing a Text File to CSV in C#
Question
I am new to C# development. I need to parse a huge text file in which each record spans several lines of data. The output will be a CSV file.
The file follows this pattern:
Acronym: TIFFE
Name of proposal: Thermal Systems Integration for Fuel Economy
Contract number: 233826
Instrument: CP – FP
#
Acronym: STREAMLINE
Name of proposal: Strategic Research For Innovative Marine Propulsion Concepts
Contract number: 233896
Instrument: CP – FP
Here # marks the start of a new record. The text file contains hundreds of such records. I want to parse everything into a CSV file with columns for Acronym, Name of proposal, and so on, where each row holds the data for one record.
What is the best way to approach this?
I guess I have to parse the data.
Recommended answer
This LINQ statement parses the input file into a sequence of records and writes each record as one CSV line to an output file. It assumes that every record has the same fields in the same order, and it does not write a header row:
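// Requires: using System.IO; using System.Linq;
File.WriteAllLines("output.csv", File
    .ReadLines("input.txt")                      // read lines lazily, not all at once
    .GroupDelimited(line => line == "#")         // one group of lines per record
    .Select(g => string.Join(",", g
        .Select(line => "\"" + line
            .Substring(line.IndexOf(": ") + 1)   // keep the text after the label
            .Trim()
            .Replace("\"", "\"\"") + "\""))));   // double embedded quotes, then wrap in quotes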
Output:
"TIFFE","Thermal Systems Integration for Fuel Economy","233826","CP – FP"
"STREAMLINE","Strategic Research For Innovative Marine Propulsion Concepts","233896","CP – FP"
Helper method:
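// Requires: using System; using System.Collections.Generic;
// Extension methods must live in a static class; the class name here is arbitrary.
static class EnumerableExtensions
{
    // Splits a sequence into groups, starting a new group at each delimiter item.
    // The delimiter items themselves are dropped.
    public static IEnumerable<IEnumerable<T>> GroupDelimited<T>(
        this IEnumerable<T> source, Func<T, bool> delimiter)
    {
        var g = new List<T>();
        foreach (var x in source)
        {
            if (delimiter(x))
            {
                yield return g;      // delimiter reached: emit the finished group
                g = new List<T>();
            }
            else
            {
                g.Add(x);
            }
        }
        yield return g;              // emit the last group (no trailing delimiter needed)
    }
}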
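The question asks for named columns, and the statement above relies on every record listing the same fields in the same order. If that cannot be guaranteed, one possible variation is to key each value by its label and emit a header row. The following is only a sketch, not part of the original answer: the names RecordCsv, ToCsvLines and Quote are made up for illustration, and it assumes each data line has the form "Label: value" with the label ending at the first colon.

```csharp
using System.Collections.Generic;
using System.Linq;

static class RecordCsv
{
    // Wraps a value in double quotes and doubles any embedded quotes.
    static string Quote(string value) => "\"" + value.Replace("\"", "\"\"") + "\"";

    // Converts "Label: value" lines, with records separated by "#", into CSV lines.
    // The first returned line is a header built from the labels in order of first appearance.
    public static List<string> ToCsvLines(IEnumerable<string> lines)
    {
        var columns = new List<string>();
        var records = new List<Dictionary<string, string>>();
        var current = new Dictionary<string, string>();

        foreach (var raw in lines)
        {
            var line = raw.Trim();
            if (line == "#")
            {
                // Record separator: store the finished record, if it has any fields.
                if (current.Count > 0) records.Add(current);
                current = new Dictionary<string, string>();
                continue;
            }

            int split = line.IndexOf(':');
            if (split < 0) continue; // assumption: lines without a label are skipped

            var label = line.Substring(0, split).Trim();
            var value = line.Substring(split + 1).Trim();
            if (!columns.Contains(label)) columns.Add(label);
            current[label] = value;
        }
        if (current.Count > 0) records.Add(current); // last record has no trailing "#"

        var output = new List<string> { string.Join(",", columns.Select(Quote)) };
        foreach (var rec in records)
        {
            // A field missing from a record becomes an empty quoted cell.
            output.Add(string.Join(",", columns.Select(c =>
                Quote(rec.TryGetValue(c, out var v) ? v : ""))));
        }
        return output;
    }
}
```

It could be called as File.WriteAllLines("output.csv", RecordCsv.ToCsvLines(File.ReadLines("input.txt"))). Unlike the streaming LINQ statement, this sketch keeps all records in memory until the full column list is known, which should be fine for hundreds of records but may not suit very large files.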