Reading a CSV with double quotes using the LumenWorks CSV reader
Question
I'm reading a CSV file using the LumenWorks CSV reader. Below is an example record:
"001-0000265-003"|"Some detail"|"detal1"|"detail2"|"detal3"|"detail4"|"detail5"|"detail6"
I've created a class that reads this file with the constructor call below:
using (var input = new CsvReader(stream, true, '|'))
{
    // logic to create an XML here
}
This works fine when there are no double quotes inside the details. But in a scenario like this:
"001-0000265-003"|"Some " detail"|"detal1"|"detail2"|"detal3"|"detail4"|"detail5"|"detail6"
I get this exception:
An unhandled exception of type 'LumenWorks.Framework.IO.Csv.MalformedCsvException' occurred in LumenWorks.Framework.IO.dll
So I then used the CsvReader constructor that takes 7 arguments:
new CsvReader(stream, true, '|', '"', '"', '#', LumenWorks.Framework.IO.Csv.ValueTrimmingOptions.All)
But I'm still getting the same error. Any suggestions would be appreciated.
I'm also reading some more complex fields, such as the following:
"001-0000265-003"|"ABC 33"X23" CDE 32'X33" AAA, BB'C"|"detal1"|"detail2"|"detal3"|"detail4"|"detail5"|"detail6"
Recommended answer
I've tested it with your sample data, and it's pretty difficult to fix this malformed line after the fact (for example, from the catch block). So I would not use a quoting character at all. Instead, just use the pipe delimiter and remove the " characters later via csv[i].Trim('"').
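To make the idea concrete without the library, here is a minimal sketch that treats the pipe as the only special character and strips the surrounding quotes afterwards. It uses plain `string.Split` rather than CsvReader, and `PipeSplitSketch` and `SplitPipeLine` are hypothetical names chosen for illustration. It assumes that no field contains a pipe or a line break.

```csharp
using System;
using System.Linq;

public static class PipeSplitSketch
{
    // Hypothetical helper (not part of LumenWorks): split on the pipe only,
    // then trim whitespace and the surrounding quote characters from each field.
    public static string[] SplitPipeLine(string line)
    {
        return line.Split('|')
                   .Select(field => field.Trim().Trim('"'))
                   .ToArray();
    }

    public static void Main()
    {
        // The "complex" record from the question, shortened to three fields.
        string line = "\"001-0000265-003\"|\"ABC 33\"X23\" CDE 32'X33\" AAA, BB'C\"|\"detal1\"";

        foreach (string field in SplitPipeLine(line))
        {
            Console.WriteLine("[{0}]", field);
        }
    }
}
```

Because the inner quotes are never interpreted, the second field comes through as `ABC 33"X23" CDE 32'X33" AAA, BB'C` with only its outer quotes removed.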
Here's a method that uses CsvReader to parse the file and return the fields of every line:
// Required using directives (at the top of the file):
using System;
using System.Collections.Generic;
using System.IO;
using System.Text;
using LumenWorks.Framework.IO.Csv;

private static List<List<string>> GetAllLineFields(string fullPath)
{
    List<List<string>> allLineFields = new List<List<string>>();
    var fileInfo = new FileInfo(fullPath);
    using (var reader = new StreamReader(fileInfo.FullName, Encoding.Default))
    {
        Char quotingCharacter = '\0'; // no quoting character
        Char escapeCharacter = quotingCharacter;
        Char delimiter = '|';
        // hasHeaders is true, so the first line of the file is treated as the header row
        using (var csv = new CsvReader(reader, true, delimiter, quotingCharacter, escapeCharacter, '\0', ValueTrimmingOptions.All))
        {
            csv.DefaultParseErrorAction = ParseErrorAction.ThrowException;
            //csv.ParseError += csv_ParseError; // if you want to handle it somewhere else
            csv.SkipEmptyLines = true;
            while (csv.ReadNextRecord())
            {
                List<string> fields = new List<string>(csv.FieldCount);
                for (int i = 0; i < csv.FieldCount; i++)
                {
                    try
                    {
                        string field = csv[i];
                        // strip the quotes that the reader no longer interprets
                        fields.Add(field.Trim('"'));
                    }
                    catch (MalformedCsvException)
                    {
                        // log, should not be possible anymore
                        throw;
                    }
                }
                allLineFields.Add(fields);
            }
        }
    }
    return allLineFields;
}
Test with a file that contains your sample data, followed by the output:
List<List<string>> allLineFields = GetAllLineFields(@"C:\Temp\Test\CsvFile.csv");
foreach (List<string> lineFields in allLineFields)
    Console.WriteLine(string.Join(",", lineFields.Select(s => string.Format("[{0}]", s)))); // needs using System.Linq;
[001-0000265-003],[Some detail],[detal1],[detail2],[detal3],[detail4],[detail5],[detail6]
[001-0000265-003],[Some " detail],[detal1],[detail2],[detal3],[detail4],[detail5],[detail6]
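One caveat with `Trim('"')`: it removes every leading and trailing quote, not just the outer pair. The sample data does not show such a case, but if a value can itself end with a quote (an inch mark, say), a stricter helper may be preferable. The following is only a sketch under that assumption; `QuoteStripSketch` and `StripOuterQuotes` are hypothetical names, not part of LumenWorks.

```csharp
using System;

public static class QuoteStripSketch
{
    // Hypothetical helper: remove exactly one leading and one trailing quote,
    // and only when both are present. Inner quotes are left untouched.
    public static string StripOuterQuotes(string field)
    {
        if (field.Length >= 2 && field[0] == '"' && field[field.Length - 1] == '"')
        {
            return field.Substring(1, field.Length - 2);
        }
        return field;
    }

    public static void Main()
    {
        // Assumed input: a quoted field whose value ends with an inch mark.
        string raw = "\"ABC 33\"\"";

        Console.WriteLine(raw.Trim('"'));         // prints: ABC 33
        Console.WriteLine(StripOuterQuotes(raw)); // prints: ABC 33"
    }
}
```

In the method above, this helper could be called in place of `field.Trim('"')` if that stricter behavior is wanted.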