CSV Text file parser with TextFieldParser - MalformedLineException


Problem description

I am working on a CSV parser using the C# TextFieldParser class.

My CSV data is delimited by , and strings are enclosed by a " character.

However, a cell can sometimes contain a " itself, which appears to make the parser throw a MalformedLineException.

This is my C# code so far:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using Microsoft.VisualBasic.FileIO;

namespace CSV_Parser
{
    class Program
    {
        static void Main(string[] args)
        {
            // Init
            string CSV_File = "test.csv";

            // Proceed If File Is Found
            if (File.Exists(CSV_File))
            {
                // Test
                Parse_CSV(CSV_File);
            }

            // Finished
            Console.WriteLine("Press any key to exit ...");
            Console.ReadKey();
        }

        static void Parse_CSV(String Filename)
        {
            using (TextFieldParser parser = new TextFieldParser(Filename))
            {
                parser.TextFieldType = FieldType.Delimited;
                parser.SetDelimiters(",");
                parser.TrimWhiteSpace = true;
                while (!parser.EndOfData)
                {
                    string[] fieldRow = parser.ReadFields();
                    foreach (string fieldRowCell in fieldRow)
                    {
                        // todo
                    }
                }
            }
        }
    }
}

This is the content of my test.csv file:

" dummy test"s data",   b  ,  c  
d,e,f
gh,ij

What is the best way to deal with " in my row cell data?


UPDATE

Based on Tim Schmelter's answer, I have modified my code to the following:

static void Parse_CSV(String Filename)
{
    using (TextFieldParser parser = new TextFieldParser(Filename))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");
        parser.HasFieldsEnclosedInQuotes = false;
        parser.TrimWhiteSpace = true;
        while (parser.PeekChars(1) != null)
        {
            var cleanFieldRowCells = parser.ReadFields().Select(
                f => f.Trim(new[] { ' ', '"' }));

            Console.WriteLine(String.Join(" | ", cleanFieldRowCells));
        }
    }
}

Which appears to produce the following (correctly):

Is this the best way to deal with strings that are enclosed in quotes and also contain quotes?

Solution

Could you omit the quoting-character by setting HasFieldsEnclosedInQuotes to false?

using (var parser = new TextFieldParser(@"Path"))
{
    parser.HasFieldsEnclosedInQuotes = false;
    parser.Delimiters = new[]{","};
    while(parser.PeekChars(1) != null)
    {
        string[] fields = parser.ReadFields();
    }
}
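
One side effect of disabling quote handling is worth checking against your data: the parser then splits on every comma, including commas inside a quoted cell, and leaves the quote characters in the field values. A minimal sketch of this, where the class and method names (QuoteHandlingDemo, SplitRaw) are made up for illustration and the input is read from a string instead of a file:

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

static class QuoteHandlingDemo
{
    // Hypothetical helper: parses a single line with quote handling
    // switched off and returns the fields exactly as the parser sees them.
    public static string[] SplitRaw(string line)
    {
        using (var parser = new TextFieldParser(new StringReader(line)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.HasFieldsEnclosedInQuotes = false;
            return parser.ReadFields();
        }
    }
}
```

For the input "a,b",c this yields three fields, the first two still carrying one quote each, so a cell that legitimately contains a comma is broken apart. If your data never has commas inside cells, this does not matter.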

You can remove the quotes manually:

var cleanFields = fields.Select(f => f.Trim(new[]{ ' ', '"' }));
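
If most of your lines are well-formed and only some contain a stray quote, an alternative is to leave quote handling on and fall back to the manual split only for the lines the parser rejects. The sketch below assumes that approach; LenientCsv and Parse are illustrative names, not part of any library. It relies on TextFieldParser placing the rejected line in its ErrorLine property when ReadFields throws MalformedLineException.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

static class LenientCsv
{
    // Hypothetical helper: quote-aware parsing for well-formed lines,
    // naive split-and-trim for lines the parser reports as malformed.
    public static List<string[]> Parse(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.TrimWhiteSpace = true;

            while (!parser.EndOfData)
            {
                try
                {
                    rows.Add(parser.ReadFields());
                }
                catch (MalformedLineException)
                {
                    // ErrorLine holds the raw text of the line that failed.
                    // Split it on commas and strip spaces and quotes by hand.
                    rows.Add(parser.ErrorLine
                        .Split(',')
                        .Select(f => f.Trim(' ', '"'))
                        .ToArray());
                }
            }
        }
        return rows;
    }
}
```

Usage with a file would look like LenientCsv.Parse(new StreamReader("test.csv")). The fallback still splits on every comma, so a malformed line that also has a comma inside a cell will come out with too many fields.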
