CSV Text file parser with TextFieldParser - MalformedLineException

Problem description

I am working on a CSV parser using C# TextFieldParser class.

My CSV data is delimited by , and strings are enclosed in " characters.

However, a cell in a data row can sometimes contain a " character itself, which appears to make the parser throw an exception.

This is my C# code so far:

using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.IO;
using Microsoft.VisualBasic.FileIO;

namespace CSV_Parser
{
    class Program
    {
        static void Main(string[] args)
        {
            // Init
            string CSV_File = "test.csv";

            // Proceed If File Is Found
            if (File.Exists(CSV_File))
            {
                // Test
                Parse_CSV(CSV_File);
            }

            // Finished
            Console.WriteLine("Press any key to exit ...");
            Console.ReadKey();
        }

        static void Parse_CSV(String Filename)
        {
            using (TextFieldParser parser = new TextFieldParser(Filename))
            {
                parser.TextFieldType = FieldType.Delimited;
                parser.SetDelimiters(",");
                parser.TrimWhiteSpace = true;
                while (!parser.EndOfData)
                {
                    string[] fieldRow = parser.ReadFields();
                    foreach (string fieldRowCell in fieldRow)
                    {
                        // todo
                    }
                }
            }
        }
    }
}

These are the contents of my test.csv file:

" dummy test"s data",   b  ,  c  
d,e,f
gh,ij

What is the best way to deal with " in my row cell data?
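To confirm which rows the parser rejects, one option is to catch MalformedLineException and read the parser's ErrorLine and ErrorLineNumber properties. The following is only a minimal sketch: the class name MalformedRowReport and the method FindMalformedLines are hypothetical, and the sample data is passed in through a StringReader instead of being read from test.csv.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper, not part of the original post.
static class MalformedRowReport
{
    // Returns the line numbers (as reported by the parser) of rejected rows.
    public static List<long> FindMalformedLines(string csvText)
    {
        var badLines = new List<long>();
        using (var parser = new TextFieldParser(new StringReader(csvText)))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.TrimWhiteSpace = true;
            while (!parser.EndOfData)
            {
                try
                {
                    parser.ReadFields();
                }
                catch (MalformedLineException)
                {
                    // The parser keeps the offending text and its line number.
                    badLines.Add(parser.ErrorLineNumber);
                    Console.WriteLine("Line {0} could not be parsed: {1}",
                        parser.ErrorLineNumber, parser.ErrorLine);
                }
            }
        }
        return badLines;
    }
}

static class Program
{
    static void Main()
    {
        // Same shape as the test.csv contents shown above.
        string sample = "\" dummy test\"s data\",   b  ,  c  \nd,e,f\ngh,ij\n";
        List<long> bad = MalformedRowReport.FindMalformedLines(sample);
        Console.WriteLine("Malformed rows found: {0}", bad.Count);
    }
}
```

Because the exception is caught inside the loop, the remaining rows are still read after a bad one.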

Update

Based on Tim Schmelter's answer, I have modified my code to the following:

static void Parse_CSV(String Filename)
{
    using (TextFieldParser parser = new TextFieldParser(Filename))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(",");
        parser.HasFieldsEnclosedInQuotes = false;
        parser.TrimWhiteSpace = true;
        while (parser.PeekChars(1) != null)
        {
            var cleanFieldRowCells = parser.ReadFields().Select(
                f => f.Trim(new[] { ' ', '"' }));

            Console.WriteLine(String.Join(" | ", cleanFieldRowCells));
        }
    }
}

Which appears to produce the following (correctly):

Is this the best way to deal with strings that are enclosed in quotes and also contain quotes?

Recommended answer

Could you omit the quoting-character by setting HasFieldsEnclosedInQuotes to false?

using (var parser = new TextFieldParser(@"Path"))
{
    parser.HasFieldsEnclosedInQuotes = false;
    parser.Delimiters = new[]{","};
    while(parser.PeekChars(1) != null)
    {
        string[] fields = parser.ReadFields();
    }
}

You can then remove the quotes manually:

var cleanFields = fields.Select(f => f.Trim(new[]{ ' ', '"' }));
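One trade-off of setting HasFieldsEnclosedInQuotes to false is that a comma inside a quoted field is then treated as a delimiter. A possible alternative, shown here only as a sketch, is to leave quote handling on and fall back to a plain split solely for the rows the parser rejects. The class name LenientCsv and the method ReadRows are hypothetical, and the fallback assumes that malformed rows contain no commas inside their fields.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using Microsoft.VisualBasic.FileIO;

// Hypothetical helper, not part of the original answer.
static class LenientCsv
{
    public static List<string[]> ReadRows(TextReader reader)
    {
        var rows = new List<string[]>();
        using (var parser = new TextFieldParser(reader))
        {
            parser.TextFieldType = FieldType.Delimited;
            parser.SetDelimiters(",");
            parser.TrimWhiteSpace = true;
            // HasFieldsEnclosedInQuotes keeps its default value (true).
            while (!parser.EndOfData)
            {
                try
                {
                    rows.Add(parser.ReadFields());
                }
                catch (MalformedLineException)
                {
                    // Fallback for a rejected row: split the raw line on commas
                    // and strip surrounding whitespace and quotes from each cell.
                    string[] cells = parser.ErrorLine
                        .Split(',')
                        .Select(f => f.Trim(' ', '"', '\t', '\r', '\n'))
                        .ToArray();
                    rows.Add(cells);
                }
            }
        }
        return rows;
    }
}

static class Program
{
    static void Main()
    {
        // First row has a stray quote; third row has a comma inside quotes.
        string sample = "\" dummy test\"s data\",   b  ,  c  \nd,e,f\n\"x, y\",z\n";
        foreach (string[] row in LenientCsv.ReadRows(new StringReader(sample)))
        {
            Console.WriteLine(String.Join(" | ", row));
        }
    }
}
```

Well-formed rows keep the parser's normal quote handling, so only the rows that would otherwise throw go through the cruder split-and-trim path.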
