CSV text file parser with TextFieldParser - MalformedLineException
Question
I am working on a CSV parser using the C# TextFieldParser class.
My CSV data is delimited by , and each string is enclosed by a " character.
However, a cell in a data row can sometimes contain a " itself, which appears to make the parser throw an exception.
This is my C# code so far:
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using Microsoft.VisualBasic.FileIO;

namespace CSV_Parser
{
    class Program
    {
        static void Main(string[] args)
        {
            // Init
            string CSV_File = "test.csv";

            // Proceed If File Is Found
            if (File.Exists(CSV_File))
            {
                // Test
                Parse_CSV(CSV_File);
            }

            // Finished
            Console.WriteLine("Press any key to exit ...");
            Console.ReadKey();
        }

        static void Parse_CSV(String Filename)
        {
            using (TextFieldParser parser = new TextFieldParser(Filename))
            {
                parser.TextFieldType = FieldType.Delimited;
                parser.SetDelimiters(",");
                parser.TrimWhiteSpace = true;

                while (!parser.EndOfData)
                {
                    string[] fieldRow = parser.ReadFields();
                    foreach (string fieldRowCell in fieldRow)
                    {
                        // todo
                    }
                }
            }
        }
    }
}
This is the content of my test.csv file:
" dummy test"s data", b , c
d,e,f
gh,ij
What is the best way to deal with a " in my row cell data?
Update
Based on Tim Schmelter's answer, I have modified my code to the following:
static void Parse_CSV(String Filename)
{
    using (TextFieldParser parser = new TextFieldParser(Filename))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");
        parser.HasFieldsEnclosedInQuotes = false;
        parser.TrimWhiteSpace = true;

        while (parser.PeekChars(1) != null)
        {
            var cleanFieldRowCells = parser.ReadFields().Select(
                f => f.Trim(new[] { ' ', '"' }));
            Console.WriteLine(String.Join(" | ", cleanFieldRowCells));
        }
    }
}
Which appears to produce the following (correctly):
Is this the best way to deal with strings that are enclosed in quotes and also contain quotes?
Recommended answer
Could you omit the quoting character by setting HasFieldsEnclosedInQuotes to false?
using (var parser = new TextFieldParser(@"Path"))
{
    parser.HasFieldsEnclosedInQuotes = false;
    parser.Delimiters = new[] { "," };
    while (parser.PeekChars(1) != null)
    {
        string[] fields = parser.ReadFields();
    }
}
You can remove the quotes manually:
var cleanFields = fields.Select(f => f.Trim(new[]{ ' ', '"' }));
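One caveat: with HasFieldsEnclosedInQuotes set to false, a comma inside a quoted field is treated as a delimiter as well. If that matters for your data, a possible alternative is to leave quote handling on and fall back to a plain split only for the lines the parser rejects. The following is a minimal sketch under that assumption, not a definitive implementation. CsvFallbackSketch and ParseLenient are hypothetical names, the extra "x, y",z row is made up for illustration, and the fallback assumes that malformed lines have no commas inside their fields.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

static class CsvFallbackSketch
{
    // Hypothetical helper: parses CSV text with quote handling enabled and
    // falls back to a plain split only for lines the parser rejects.
    public static List<string[]> ParseLenient(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = true; // the default, stated for clarity

            while (!parser.EndOfData)
            {
                try
                {
                    rows.Add(parser.ReadFields());
                }
                catch (MalformedLineException)
                {
                    // ErrorLine holds the raw text of the rejected line.
                    // Assumption: such lines contain no commas inside fields.
                    rows.Add(parser.ErrorLine
                        .Split(',')
                        .Select(f => f.Trim(' ', '"'))
                        .ToArray());
                }
            }
        }
        return rows;
    }

    static void Main()
    {
        // The question's sample data plus one illustrative well-formed quoted row.
        string csv = "\" dummy test\"s data\", b , c\n\"x, y\",z\nd,e,f\ngh,ij\n";
        foreach (string[] row in ParseLenient(new StringReader(csv)))
            Console.WriteLine(string.Join(" | ", row));
    }
}
```

With the sample data, the first line should be recovered through the fallback as dummy test"s data, b and c, while "x, y" stays a single field because that row is well formed. Whether the extra try/catch is worth it depends on how often quoted fields in your files contain commas.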